access management

Sandboxes instead of walls


The mission of data warehouses (DWH) is to be the source of “undoubted” information. To maintain a “single version of truth” they operate with strict rules. However, this makes them unable to adapt to a rapidly changing business environment. As a result, it is usually essential to attach “explanations” or “corrections” to the reports created from the DWH, amending them with information from other sources. No wonder that executive reports are usually created manually in MS Excel or PowerPoint: careful hands add important information from outside the regulated DWH to the DWH reports. The idea that “DWHs collect all information” is nothing but a utopia.

Therefore, managing a company based only on “hard” data (i.e. standard reports) is not possible. However, decisions should not be based exclusively on “soft” data (i.e. data coming from unregulated channels) either.

The optimal solution is to take both sources into account.

If IT departments were willing to admit that both “hard” and “soft” worlds have solid grounds, they could support the union of the two by providing “sandboxes” for the various business units.

Business units would be able to load their own data into the sandbox without any regulation, and would be able to access the DWH from there as well. No more need for data “downloads” or “exports” from the DWH, and no more need for clumsy workarounds (think of the VLOOKUP function in Excel) to merge the two sources.
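To make the idea of merging the two sources inside the sandbox more tangible, here is a minimal sketch. It assumes, purely for illustration, that the sandbox can query a DWH table next to a table loaded by the business unit; the table names dwh_sales and sandbox_targets, their columns, and the sample figures are hypothetical, and an in-memory SQLite database stands in for the real server.

```python
import sqlite3


def build_demo_connection():
    """Create an in-memory database with one 'hard' and one 'soft' table.

    Both tables and their contents are hypothetical sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dwh_sales (region TEXT, revenue REAL)")
    conn.execute("CREATE TABLE sandbox_targets (region TEXT, target REAL)")
    conn.executemany(
        "INSERT INTO dwh_sales VALUES (?, ?)",
        [("East", 120.0), ("West", 95.0)],
    )
    # The business unit has only loaded a target for one region so far.
    conn.executemany(
        "INSERT INTO sandbox_targets VALUES (?, ?)",
        [("East", 100.0)],
    )
    return conn


def merge_dwh_with_sandbox(conn):
    """Join DWH figures with the unit's own targets, one row per region.

    A LEFT JOIN keeps every DWH row even when no target was loaded,
    which a lookup formula in a spreadsheet would report as an error.
    """
    query = """
        SELECT s.region, s.revenue, t.target, s.revenue - t.target AS gap
        FROM dwh_sales AS s
        LEFT JOIN sandbox_targets AS t ON t.region = s.region
        ORDER BY s.region
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in merge_dwh_with_sandbox(build_demo_connection()):
        print(row)
```

The point of the sketch is only that the join happens on the server, next to the DWH data, so nothing has to be exported first.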

Clearly, sandboxes are server-side environments that perform better than desktop computers. Obviously these environments have to be administered (CPU performance, storage distribution, version upgrades…), but the IT department does not take responsibility for the content. Sandboxes could also include Big Data infrastructure, giving business users the chance to get familiar with unstructured data as well.

It is important to monitor the work in the sandbox, both its technical and organizational aspects, to notice when different units are working on similar tasks. In that case, these approaches need to be coordinated.

There is nothing wrong with placing “professional” scheduler functions, similar to those of professional ETL tools, into the sandbox to support regularly executed commands. Running a process put together by business analysts every night or every Monday morning fits the concept of sandboxes. Obviously, the maintenance and error handling of these processes are managed by the business users.
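As a rough illustration of what such scheduling involves, the sketch below computes the next run time for a nightly job and for a Monday-morning job. The function names and the default run times (02:00 and 06:00) are assumptions made for this example; a real sandbox would more likely rely on an existing scheduler than on hand-written code.

```python
from datetime import datetime, time, timedelta


def next_nightly_run(now, at=time(2, 0)):
    """Return the next occurrence of the daily run time strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so run tomorrow.
        candidate += timedelta(days=1)
    return candidate


def next_weekly_run(now, weekday=0, at=time(6, 0)):
    """Return the next occurrence of `weekday` (0 = Monday) at `at` after `now`."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = datetime.combine(now.date() + timedelta(days=days_ahead), at)
    if candidate <= now:
        # It is the right weekday but the slot has passed: wait a full week.
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    now = datetime.now()
    print("Next nightly run:", next_nightly_run(now))
    print("Next Monday run: ", next_weekly_run(now))
```

Because the processes belong to the business users, whatever runs at these times should report its failures to them rather than to IT operations.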

It must also be recognized that sandboxes come with data security and access management risks. As the data structure is not as strict as in DWHs, access management is also looser. Ensuring that a user has access only to the data relevant to his region is hard to achieve. On the other hand, it is less likely that data will end up outside the regulated areas, attached to emails, on shared drives, etc.
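The region problem can be sketched in a few lines, which also shows why it is hard in a loosely structured environment. The example assumes, hypothetically, that each record is a dictionary and that regional data carries a "region" key; the function name visible_rows is invented for this illustration.

```python
def visible_rows(rows, user_regions):
    """Return only the rows whose region the user is allowed to see.

    Rows without a "region" key are hidden (deny by default), because
    there is no way to tell whom they belong to.
    """
    allowed = set(user_regions)
    return [row for row in rows if row.get("region") in allowed]


if __name__ == "__main__":
    sample = [
        {"region": "East", "revenue": 120.0},
        {"region": "West", "revenue": 95.0},
        {"revenue": 40.0},  # loaded by a business unit without a region
    ]
    print(visible_rows(sample, ["East"]))
```

The filter itself is trivial; the difficulty is that it only works if every dataset loaded into the sandbox carries a reliable region attribute, which is exactly what unregulated loading does not guarantee.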

If the sandbox is created and operated well enough, the big moment will come. Success, as I see it, is when the business units promote their polished data manipulation processes into the DWH. This demonstrates that a sense of purpose overcomes power-oriented thinking: a business unit shares its results with the whole organization.

With this, corporate data assets are enriched by something really valuable. And explorers can take another step further.

Hiflylabs creates business value from data. The core of the team has been working together for 15 years, currently with more than 50 passionate employees.

The picture was created by Johan Eklund.

Dangerous little elephant?


The elephant is said to be the most dangerous animal on the African savanna. Nevertheless, the icon of Hadoop, the quasi-standard Big Data processing tool, is a cute little elephant (by the way, the technology was named after a stuffed animal belonging to the son of Doug Cutting, one of the inventors of Hadoop). Yet, in my experience, most corporations are cautious about, or even afraid of, this new technology.

However, Hadoop-based systems can provide computing capacity at a fraction of the cost of traditional suppliers.

Why do companies not jump at this opportunity? After many conversations and a few pilot projects, I came to the following conclusions:

  • Access management

With a little exaggeration, Hadoop either lets basically anybody access any data or gives no access at all. Companies prefer more sophisticated access management than this.

  • Operational management

For most IT operations teams, keeping track of the versions of their current technology is a challenge in itself. In addition, they are wary of a system that operates differently. Instead of coming with the usual supplier support, Hadoop is mostly free but offers no support. (Cloudera and Hortonworks, which, by the way, also have Hungarian developer teams, do provide support services, but not at prices that fit Hungarian IT budgets.)

  • User interface

Currently, specialized knowledge is necessary to upload and query data using Hadoop. Many are working on an easy-to-use interface for Hadoop, and it is probably for this very reason that no standard has yet emerged from the different solutions.

Nowadays, traditional companies get seriously interested in Hadoop only when circumstances force them to change their current technology. Without such pressure, Hadoop usually stays in the “interesting experimentation” category. Only the most committed companies are exceptions.

Meanwhile, the big database vendors (Microsoft, Oracle, IBM, SAP, SAS, etc.) are all working on taming the Hadoop elephant, and I think they will be successful.


Hadoop is 10 years old. I am convinced that by the time it reaches adolescence, it will understand the world of business much better and will change it as well…
