Machine generated data

At first, the term "machine-generated data" can be confusing. One would think that all data is (or are?) either generated by one device or another or provided by an innocent mortal in this so-called era of social media and big data. So the definition needs a clear distinction. If a user enters some data in a form, it is not considered machine-generated. At the same time, the same application can track the user's location and log it on a remote server, and that log is machine-generated data.

Wikipedia says,

Machine-generated data (MGD) is the generic term for information which was automatically created from a computer process, application, or other machine without the intervention of a human. 

According to Monash Research,

In classical human-generated data, what’s recorded is the direct result of human choices. Somebody buys something, makes an inquiry about it, fills an order from inventory, makes a payment in return for the object, makes a bank deposit to have funds for the next purchase, or promotes a manager who’s been particularly successful at selling stuff. Database updates ensue. Computers memorialize these human actions more quickly and cheaply than humans carry them out. Plenty of difficulties can occur with that kind of automation — applications are commonly too inflexible or confusing — but keeping up with data volumes is generally the least of the problems.

So what are they? Are they streams of logs flowing through the information super waterway?

Maybe, until they are churned into books or toilet rolls!

Application Logs - Logs generated by web or desktop applications. The server-side logs are used for debugging and support tickets!

Call Detail Records - The ones recorded by your telecom company. They contain useful details of each call or service that passed through the switch, such as the phone numbers involved and the call duration. Needed for billing.

Web Logs - Used to count visitors and for similar web analytics.

Database Audit Logs - Auditing is enabled to watch for suspicious database activity. Often there is not much information available up front to target specific users or schema objects.

OS Logs - Track crashes and errors.

Many other applications and systems generate similar data, such as RFID readers and sensors. These messages can then be mashed up. Machine data has a structure or format, and its semantics depend on the domain it comes from.
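To make "structure or format and semantics" concrete, here is a minimal sketch that parses one web log line. It assumes the line follows the Common Log Format used by many web servers; the names `AccessLogEntry`, `CLF_PATTERN` and `parseAccessLog` are made up for this illustration, and real logs often carry extra fields that this pattern would reject.

```typescript
// Fields of one web-server access-log entry in Common Log Format (CLF).
interface AccessLogEntry {
  host: string;
  user: string;
  timestamp: string;
  method: string;
  path: string;
  status: number;
  bytes: number;
}

// Layout: host ident user [timestamp] "METHOD path protocol" status bytes
const CLF_PATTERN =
  /^(\S+) \S+ (\S+) \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\d+|-)$/;

// Returns the parsed entry, or null when the line does not match the format.
function parseAccessLog(line: string): AccessLogEntry | null {
  const m = CLF_PATTERN.exec(line.trim());
  if (m === null) return null;
  const [, host = "", user = "", timestamp = "", method = "", path = "",
    status = "0", bytes = "-"] = m;
  return {
    host,
    user,
    timestamp,
    method,
    path,
    status: Number(status),
    // CLF writes "-" when no bytes were sent; treat that as 0 here.
    bytes: bytes === "-" ? 0 : Number(bytes),
  };
}

const sample =
  '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326';
console.log(parseAccessLog(sample));
```

The format gives the structure (fixed field order), while the meaning of each field, such as a status of 200, comes from the domain, here HTTP.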

The growth of such data is fast and continuous. It arrives as a stream and, like history, it is not changed once written. It is a record of events.


Has anyone tried iPhonetracker?


Geolocation and location-based services (LBS) do push a load of data. HTML5 has geolocation functionality (though you have the choice not to be tracked). The following is sample code to test it.
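A minimal sketch of such a test, assuming a browser that exposes the standard `navigator.geolocation` object. The helper names (`GeolocationLike`, `formatPosition`, `trackOnce`) are invented for this example, and the small interfaces only mirror the parts of the browser API used here. Outside a browser, or where geolocation is unavailable, it simply reports that geolocation is not supported.

```typescript
// Minimal shapes mirroring the parts of the browser Geolocation API used here.
interface Coords { latitude: number; longitude: number; accuracy: number; }
interface Position { coords: Coords; timestamp: number; }
interface GeolocationLike {
  getCurrentPosition(
    success: (pos: Position) => void,
    error?: (err: { code: number; message: string }) => void,
  ): void;
}

// Format a position as a one-line, log-style record.
function formatPosition(pos: Position): string {
  const { latitude, longitude, accuracy } = pos.coords;
  const when = new Date(pos.timestamp).toISOString();
  return `${when} lat=${latitude} lon=${longitude} accuracy=${accuracy}m`;
}

// Ask for the position once and hand the resulting record to `sink`.
function trackOnce(
  geo: GeolocationLike | undefined,
  sink: (line: string) => void,
): void {
  if (geo === undefined) {
    sink("geolocation not supported");
    return;
  }
  geo.getCurrentPosition(
    (pos) => sink(formatPosition(pos)),
    // Called, for example, when the user denies the permission prompt.
    (err) => sink(`geolocation error ${err.code}: ${err.message}`),
  );
}

// Look up navigator.geolocation without assuming browser typings are loaded.
const browserGeo = (globalThis as unknown as {
  navigator?: { geolocation?: GeolocationLike };
}).navigator?.geolocation;

trackOnce(browserGeo, (line) => console.log(line));
```

In a browser the call triggers a permission prompt, and the record is logged only if the user agrees, which is the "choice not to be tracked" mentioned above.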
