Information Infrastructure EII TCO/ROI Hardware Uncategorized Green IT Development
A couple of recent posts noted the advantages and disadvantages of a policy of customer support involving BYOD, or Bring Your Own Device. The discussion revealed that support involved not only allowing the laptop to hook into the corporate network and access corporate apps, but also allowing “non-standard” software inside corporate boundaries: BYOS, or Bring Your Own Software. However, no one asked the obvious question: what about the data on that laptop, or the data on the Web that the laptop could access but corporate did not? Should IT support Bring Your Own Data?
This is not just a theoretical question. A recent missive from BI supplier MicroStrategy invited prospective customers to load their own spreadsheets on its cloud offering, and play around. I doubt that MicroStrategy would have suggested this if there were not significant numbers of business users out there with their own business-related data who would like to analyze it with corporate-style BI tools. One can think of other applications: OLAP (online analytical processing) on spreadsheets, beloved of CFOs and CMOs; business performance management, with corporate “what-if” scenario data lugged around by mobile executives; or simple, cheap cloud sales tools for direct sales, syncing with corporate databases.
And that may be the tip of the iceberg. After all, knowledge workers today are typically not encouraged to go out there and collect data to be analyzed. And yet, the amount of data in the Web about competitors and customers not captured by corporate data stores continues to grow. One survey a while ago estimated that 1/3 of relevant new data on the Web is not ingested by IT data stores for a year or more after it arrives. Proactive encouragement might allow “worker crowdsourcing” of this type of data. For example, communities within a computer vendor for finding out about customer computer performance would allow employees to bring their own customer-report data inside the enterprise.
So, IT support of BYOData is worth considering. But that leads to the next point: BYOData is significantly different from all previous data sources. It is consumer data; it is of widely differing types; it has typically not been supported to the same level before; and a greater proportion of it comes from consumer apps – which may or may not use the same databases or file managers as businesses do, and which often are not set up to export data in common formats. We’re not talking spreadsheets here; we’re talking GPS data collected by iPhone apps.
Data integration tools are the obvious tools for allowing careful merging of existing enterprise data and BYOData. And among those tools, I would argue, Data Virtualization (DV) tools are the best of the best.
Data Virtualization vs. Other Alternatives
Data virtualization tools are an obvious candidate to handle unusual data types and allow querying across BYOData and relational or semi-structured (emails, corporate documents) corporate data. After all, DV tools were designed from the beginning to be highly flexible in the types of data they supported, and to provide a user interface that allowed meaningful combination of multiple data types on-the-fly (i.e., for real-time querying). But there are two other obvious candidates for combining BOYData and corporate relational data warehouses: Enterprise Application Integration (EAI) and ETL (Extract, Transform, Load) tools.
EAI tools are best at combining data from multiple enterprise databases, such as Oracle Apps, SAP, Oracle Peoplesoft, and Oracle Siebel. Over the years, their abilities have been extended to other corporate data, such as IBM’s use of Ascential for its Master Data Management (MDM) product. However, they are much less likely to support non-corporate Web data, and they have less experience in delivering user-friendly interfaces to combined relational and semi-structured or unstructured data.
ETL tools are highly skilled at taking operational enterprise data that has already been massaged into a corporate-standard format and merging it for use in a data warehouse. This also means careful protection of the data warehouse from poor-quality data – not an insignificant concern when one considers the poor quality of much BYOData. However, ETL tools are also unlikely to provide full support for non-corporate Web data, and are not tasked at all with delivering user-friendly interfaces to non-relational data outside of the data warehouse.
Two concepts from programming appear relevant here: multi-tenancy, and the “sandbox.” A multi-tenant application provides separate corporate “states” over common application code, ensuring not only security but effective use of common code. A “sandbox” is a separated environment for programmers so that they can test programs in the real world without endangering operational software. In the case of BYOData, the data warehouse could be thought of as containing multiple personal “tenant data stores” that appear to each end user as if they are part of the overall data-warehouse data store, but which are actually separated into “data sandboxes” until administrators decide they are safe to incorporate into the corporate warehouse.
Assuming that IT takes this approach to handling BYOData, DV tools are the logical tools to bridge each “data sandbox” with corporate data, whether said corporate data resides in operational data stores, line-of-business file collections, or data-mart data stores. At the same time, ETL tools are the logical place to start when IT starts to plunder these BYOData sets for enterprise insights by staging the personal data into corporate data stores. When run-the-business applications can benefit from BYOData, EAI and MDM tools are the logical place to start in staging this data into application data stores.
But whether IT takes this approach or not, DV tools are the logical overall organizer of BYOData – because they are so flexible. BYOData types will be evolving rapidly over the next couple of years, as social media continue to jump from fad to fad. Only tools that are cross-organizational, like DV tools, can hope to keep up.
The IT Bottom Line
The idea of BYOData is a bit speculative. Nothing says that either software vendors or IT shops will embrace the concept – although it appears to offer substantial benefits to the enterprise.
However, if they do follow the idea of Bring Your Own Device to its logical conclusion … then BYOData via DV tools is an excellent way to go.
When IBM announced that the next rev of DB2 would support Oracle PL/SQL stored procedures, native, it seemed like a good deal. As I understand it, suppose I want to move an Oracle database function onto DB2. Before, I would have had to rewrite the PL/SQL stored procedure for the business rule in DB2’s language; now, I just recompile the SP I had on the Oracle machine. Then I can redirect any apps that use the Oracle SP to DB2, which is less of a modification than also changing the SP call in the app. It’s native, so there’s little degradation in performance. Neat, yes?
Well, yes, except that (a) in many cases, I’ve just duplicated much of the functionality of a similar existing DB2 SP, and (b) I don’t care how native it is, it doesn’t run as well as the DB2-language SP – after all, that language was designed for DB2 25 years ago. If we do this a lot, we create a situation in which we have SP A on Oracle, plus SP A and A’ (the DB2-language one) on DB2, all of which have to be upgraded together, for lots and lots of As. We’ve already got SP proliferation, with no records of which apps are calling the SP and hence a difficult app modification when we change the SP; now you want to make it worse?
This kind of thinking is outdated. It assumes the old 1990s environments of one app per database and one database vendor per enterprise. Those days are gone; now, in most situations, you have multiple databases from multiple vendors, each servicing more than one app (and composite app!).
So what’s the answer? Well, in cases like this, I tend to support the idea of “one interface to choke”. The idea is that there should be one standard SP language, with adapters to convert to multiple database SP source and compiled code – a “many to one to many” model. I send a PL/SQL SP to the converter, it puts it in standard language, then converts it to all “target” languages. It then sends it down to all target databases, where each compiles it and has it available.
Now suppose I invoke an SP originally written in PL/SQL, but it is to be applied to a DB2 database. The converter intercepts the PL/SQL-type invocation and converts it to a DB2-type format, and then (because I have told it to) redirects the invocation to the DB2 database. The database runs the DB2-type SP version, and returns the result to the converter, which performs a reverse conversion to deliver the result back to the invoking app.
The cool thing about this is that when you migrate or copy an SP, no app needs to be changed – remember, you simply tell the converter to redirect the invocation. The usual objection is, yes, but the converted SP doesn’t run as well as one written for the target database. True (actually, in the real world sometimes not true!), but the difference is marginal, and, in case you hadn’t noticed, there’s less pressure these days to get the absolute maximum out of a SP, with so many other performance knobs to twiddle.
Where’s the best place to set up such a converter? Why, an EII (Enterprise Information Integration) tool! It has the architecture – it takes SQL/XQL and translates/adapts for various databases, then reverses the translation for the results. Yes, we have to add stored procedure conversion to its tasks, but at least the architecture is in place, and we can integrate SP invocation with general SQL-handling.
And, by the way, we will have a metadata repository of stored procedures, all served up and ready for use in event-driven architectures.
What do you think?