As more businesses look to create value from their data, they are turning to options that help them enhance and augment their information and find ways to monetise it. It’s not an easy path, as many parts need to be in sync, but the infrastructure and resourcing required are straightforward and explainable (see image above).
To get started we need to first recognise that companies are all going to be at different levels of tech sophistication. Some will be starting from scratch, whilst others will have existing infrastructure to account for. Whether you’re looking to replace existing systems or utilise existing ones, careful thought needs to be applied to these decisions.
That said, data monetisation thrives in an environment where inputs (i.e. data) flow into sophisticated processing and come out as products, with every stage working in harmony. The technology used will change over time, so the design should be technology agnostic, but the pattern of infrastructure laid out below is one that will last well into the future.
For the purposes of this article, we’ll refer to this system as the Data Hub. It’s the place where a company builds the infrastructure that lets it create various revenue sources and thus monetise its data. When we refer to clients below, we mean clients of the company that provides the Data Hub and monetises its data.
Data inputs — At the very left of the ecosystem are data inputs. These can be the company’s own data, collected from transactions with its clients, or information it creates or has exclusive access to. Third-party data refers to data the company pays for or collects from free sources (for example by web scraping) and can use to create new products.
DATA HUB INFRASTRUCTURE
Data Governance — important for any project, governance provides oversight of how data is collected, analysed and shared. It should not be set up to stifle innovation, but it is equally important not to go too far down an innovation pathway only to realise later that the pathway was wrong.
ETL — this stands for Extract, Transform and Load, a common process in any data team in which raw data is extracted from its sources, cleansed and wrangled into a form that is easier to analyse, and then loaded into the hub. This might involve removing blank columns or rows, or ensuring that a state name is spelled consistently throughout the database.
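To make the transform step concrete, here is a minimal sketch in plain Python of the two clean-ups mentioned above: dropping blank rows and columns, and making state names consistent. The column names, the sample rows and the `STATE_ALIASES` mapping are all hypothetical, chosen only for illustration; a real pipeline would more likely use a dedicated ETL tool or a dataframe library.

```python
# Hypothetical mapping from spellings seen in raw data to one canonical name.
STATE_ALIASES = {
    "nsw": "New South Wales",
    "new south wales": "New South Wales",
    "vic": "Victoria",
    "victoria": "Victoria",
}


def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as blank."""
    return value is None or str(value).strip() == ""


def transform(rows):
    """Drop fully blank rows and columns, then normalise the 'state' field."""
    # Remove rows in which every field is blank.
    rows = [r for r in rows if not all(is_blank(v) for v in r.values())]

    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for col in row:
            if col not in columns:
                columns.append(col)

    # Keep only columns that hold a value in at least one remaining row.
    keep = [c for c in columns if any(not is_blank(r.get(c)) for r in rows)]

    cleaned = []
    for row in rows:
        out = {c: row.get(c) for c in keep}
        if "state" in out and not is_blank(out["state"]):
            raw_state = str(out["state"]).strip()
            # Fall back to the trimmed original if the spelling is unknown.
            out["state"] = STATE_ALIASES.get(raw_state.lower(), raw_state)
        cleaned.append(out)
    return cleaned


# Hypothetical raw extract with a blank row, a blank column and mixed spellings.
raw = [
    {"customer": "A", "state": "NSW", "notes": ""},
    {"customer": "", "state": "", "notes": ""},
    {"customer": "B", "state": "new south wales ", "notes": None},
    {"customer": "C", "state": "Vic", "notes": ""},
]

clean = transform(raw)
for row in clean:
    print(row)
```

The sketch leaves unknown state spellings untouched rather than guessing, so they can be reviewed and added to the mapping later.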
Analytics Workspaces — this is where internal project analysts, and even people outside those teams who have the skills and know-how, can access the data through a variety of tools (e.g. SQL, Knime, Alteryx, Power BI, Qlik, Tableau) to interrogate and assess it in various ways. Some of this analysis may be visual, whilst some might be purely tabular. Here analysts assess the viability of certain datasets and potentially run tests on the data before it goes any further down the pipeline.
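As one illustration of a purely tabular check, the sketch below uses Python’s built-in `sqlite3` module to run a SQL query that profiles a dataset before it moves further along. The `transactions` table, its columns and the sample values are hypothetical; in practice the query would run against whatever store the hub actually uses.

```python
import sqlite3

# In-memory database standing in for the hub's analytics store (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (state TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        ("New South Wales", 120.0),
        ("New South Wales", 80.0),
        ("Victoria", 50.0),
        ("Victoria", None),  # a record with a missing amount
    ],
)


def profile_by_state(connection):
    """Return per-state row counts, non-null amount counts and totals."""
    query = """
        SELECT state,
               COUNT(*)      AS n_rows,
               COUNT(amount) AS n_with_amount,
               SUM(amount)   AS total_amount
        FROM transactions
        GROUP BY state
        ORDER BY state
    """
    return connection.execute(query).fetchall()


summary = profile_by_state(conn)
for state, n_rows, n_with_amount, total in summary:
    print(state, n_rows, n_with_amount, total)
```

Comparing `n_rows` with `n_with_amount` gives a quick view of how complete a dataset is, which is one simple input into deciding whether it is viable for a product.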
Research and Product Development — in this section, teams are dedicated to working directly with customers and creating proofs of concept (POCs) with them. This should follow a lean startup approach, where various MVPs (minimum viable products) are created and tested quickly before further development takes place.
The research element here keeps an eye on how the products will fit and compare to similar products in the market and how well the market is suited for these products.
Sales/Marketing/Commercialisation — in this section the various ways of selling and marketing the product are created. This might be an online store using Shopify, Wix or another provider for the front end, or it might be a direct sales model aimed at the target client group. Whichever way works, the terms on which you’ll sell those data products need to be determined, especially where you’re potentially remixing your data with clients’ insights for wider usage.
REVENUE SOURCES
This is where the various products created are monetised. Each revenue source typically suits the needs of a different type of clientele, which we’ll explain after the descriptions below.
Subscriptions — this is where clients access a dataset via APIs, or use a ready-made application on their desktop or mobile device. Subscriptions to data may suit clients who already have the tools they need internally to manipulate and use data. They might have a data science or analytics team that is adept at performing these functions.
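One way to picture subscription access is as a check that sits in front of each dataset. The sketch below is an assumption about how such a gate could look, not a description of any particular product: the API keys, client names, dataset names and the `fetch_dataset` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """A client's subscription and the datasets it covers (hypothetical model)."""
    client: str
    datasets: set = field(default_factory=set)


# Hypothetical registry of API keys and the data products on offer.
SUBSCRIPTIONS = {
    "key-abc": Subscription("Acme Ltd", {"retail_sales"}),
}
DATASETS = {
    "retail_sales": [{"month": "2024-01", "units": 100}],
    "foot_traffic": [{"month": "2024-01", "visits": 2500}],
}


def fetch_dataset(api_key, dataset):
    """Return a dataset only if the API key's subscription includes it."""
    subscription = SUBSCRIPTIONS.get(api_key)
    if subscription is None:
        raise PermissionError("unknown API key")
    if dataset not in subscription.datasets:
        raise PermissionError(
            f"{subscription.client} is not subscribed to {dataset}"
        )
    return DATASETS[dataset]


# A subscribed dataset is returned; an unsubscribed one is refused.
print(fetch_dataset("key-abc", "retail_sales"))
try:
    fetch_dataset("key-abc", "foot_traffic")
except PermissionError as err:
    print("refused:", err)
```

In a real hub this check would sit behind a web API with proper authentication and billing; the point here is only that each subscription maps a client to the specific datasets it has paid for.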
Platforms — for clients who want to experiment with data but are restricted from doing so in their own business (e.g. head office permissions might take too long to arrive from overseas, or they have limited spending capacity), this could be the solution. In this scenario, the data and the tools to analyse it are provided to customers, potentially via online data science notebooks (e.g. Jupyter notebooks). A good example is the Python code that users of Quantopian get to write to test various trading strategies using data that Quantopian provides. Various security controls need to be in place here, such as preventing data from being taken off the platform.
Partnerships — finally, this last revenue source arises when a client wishes to use elements of the data hub and work with the data hub provider to create new products. In these situations, a joint venture may be formed, through which the data hub provider monetises the arrangement by taking a revenue share in the new product.
At any point in time, clients of the data hub may decide to move from one option to another. They could begin by using the platform, move on to subscribing to certain types of data, and then decide to partner up. The options are many, but at the heart of it all, the data hub provider is the one who showcases and provides the platform on which all of this can happen.