The publishing industry is looking for new decision-making processes that support strategies relying on data from disparate sources. A new development paradigm allows companies to serve internal and external demand for information, and enables new interactions with the stakeholders in the publishing value chain: authors, readers, publishers, distributors, wholesalers and booksellers.
Debating big data is extremely trendy nowadays: whoever talks about it gets media attention and raises the audience's interest. Yet the debate does not necessarily offer scalable experiences, since it often revolves around big projects by big companies with very big budgets.
Big data seems to belong to companies operating in sectors where hundreds of millions or billions of bytes are produced, such as telecommunications companies, banking corporations and large retailers. An organization's size influences costs and timelines, particularly for project development and the adoption of platforms and infrastructure.
In fact, big data concerns methods, procedures and tools that can be adapted to different demands and company sizes. A data intelligence project on big data is therefore an opportunity for lean companies that lack the scale of the businesses mentioned above: they can quickly take advantage of the analysis carried out on the data and information they gather.
Big Data in book publishing
The recent, widespread adoption of digital devices has completely changed the information landscape in which the publishing industry operates, along several dimensions:
- the quantity of data (due to the explosion of digital devices that produce data by interacting with contents);
- the quality (at large volumes, data quality and meaning tend to improve, reducing the need for a complex process of interpretation);
- the frequency at which data are generated (data are continuously produced);
- the format (not only structured data coming from certified sources such as corporate information systems but also unstructured data that don’t allow for traditional analysis methods);
- the speed at which individual data items are generated, consumed and discarded (obsolescence).
It is worth noting that in recent years we have witnessed the appearance of different ways (or containers, if you like) to enjoy content: paper, e-readers, smartphones, digital TVs, cinema, cars, wearable devices (glasses and watches), just to name a few. The frequency and speed at which we switch from one container to the next give rise to a new condition of digital continuity in which we are beginning to be fully immersed.
A radical rethinking of integrated data management is therefore crucial.
An author's work reaches the reader through the publisher, the physical or digital distributor, the wholesaler, and the physical or digital bookstore. Every sizeable player in the value chain needs advanced data management systems, often based on different units of measure. For example, it may be necessary to communicate in terms of sell-in or sell-out volumes, cover price, tax-free price, discounts and promotions. Furthermore, some channels, such as large-scale retailers, use specific units of measure geared to the sales channel rather than to integration with the other systems in the value chain.
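To make the units-of-measure problem concrete, here is a minimal sketch of normalizing channel records into a common schema. All field names, the 4% VAT rate and the sample record are illustrative assumptions, not taken from any real system:

```python
VAT_RATE = 0.04  # assumed VAT rate on books; actual rates vary by country

def normalize_record(record: dict) -> dict:
    """Convert a channel-specific sales record into a common schema
    expressed as net (tax-free) revenue, keeping the original measure
    (sell-in vs sell-out) as an explicit field."""
    gross = record["cover_price"]
    net = gross / (1 + VAT_RATE)                      # strip VAT from cover price
    net_after_discount = net * (1 - record.get("discount", 0.0))
    return {
        "isbn": record["isbn"],
        "channel": record["channel"],
        "units": record["units"],                     # volume in the channel's own terms
        "measure": record["measure"],                 # 'sell_in' or 'sell_out'
        "net_revenue": round(net_after_discount * record["units"], 2),
    }

# Hypothetical sell-out record from a chain bookstore.
record = {"isbn": "978-0000000000", "channel": "chain_bookstore",
          "units": 120, "measure": "sell_out",
          "cover_price": 15.60, "discount": 0.25}
print(normalize_record(record))
```

Keeping `measure` explicit, rather than silently mixing sell-in and sell-out volumes, is what allows records from different players in the chain to be compared at all.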
A new paradigm
The development of data intelligence projects is still based on methods where the main stages are as in the picture below:
Due to the features of the solutions and the stakeholders involved, requirements gathering, the initial analysis phase and the definition of specifications entail high project costs, and they are the real obstacle to the rapid development of business solutions.
Costs, in terms of both money and time, together with the rapid obsolescence of technological solutions, make this approach extremely onerous for companies that cannot allocate adequate investments, even when they handle large quantities of data.
Actually, thanks to new technologies, it is possible to adopt agile, scalable methods in a new development paradigm, where even the traditional project goal of 'on time, on budget and on spec' becomes 'right time, right budget and beyond expectation'.
This new development paradigm is based upon innovative methods that leverage tools, platforms and technologies available.
The Spiral approach provides the best reference.
This approach is characterized by very quick iterations of data collection, prototype design, development based on advanced technologies, prototype testing and planning of the next iteration. The result is a solution that evolves continuously and keeps broadening its information sources.
Experimental results exceed expectations in terms of time and overall cost. The distribution of effort across stages differs substantially from the traditional approach: whereas in the sequential approach requirements gathering and analysis accounted for about 50% of the overall project, the Spiral approach assigns roughly 20% to each of the stages of requirements analysis, design, development and testing.
Additionally, the review stage becomes the most important one, at about 35% compared with 5% in the traditional approach, as it is driven by interaction with clients in order to refine the requirements and converge quickly and incrementally on the desired solution.
It is no coincidence that Formula 1 teams provide other industries with consulting services, thanks to the know-how built up by processing the huge data flows coming from the cars during races; those flows are used to define competitive strategies in real time and to act dynamically on system configurations.
Data Intelligence on Big Data
The book publishing industry needs a dynamic data-querying system that can quickly adapt to requests driven by evolving market conditions and unpredictable customer behavior.
It is crucial to understand the correlation between sales trends during a book launch and statistics on e-reader use, or between communication and marketing activities, in order to gauge the drivers of commercial or editorial success; or to analyze the elasticity of demand as prices change for best-sellers, mid-sellers, low-sellers and long-sellers. Production and distribution matter as well: optimizing print runs and book distribution to guarantee the highest availability across the different sales channels (large-scale retailers, chain bookstores, indie bookstores); reducing returns in the different channels; enabling testing strategies to optimize online editorial offerings (cover, metadata, pricing); and determining which pairs of books are most often requested and purchased together.
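The last of those questions, which pairs of books are purchased together, is the easiest to sketch: it reduces to counting co-occurrences of titles within purchase baskets. The basket data below is invented for illustration; real input would come from point-of-sale or e-commerce logs:

```python
from collections import Counter
from itertools import combinations

# Illustrative baskets: each set holds the titles bought in one transaction.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"B", "C"},
    {"A", "B", "D"},
]

def top_pairs(baskets, n=3):
    """Count how often each pair of titles appears in the same basket
    and return the n most frequent pairs."""
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return pair_counts.most_common(n)

print(top_pairs(baskets))  # ('A', 'B') turns out to be the most frequent pair
```

At real scale this naive count would be replaced by a market-basket algorithm, but the question being answered is exactly the same.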
Some of these questions seem obvious to experts who have applied unquestioned models up to now.
Yet the data, arriving massively from different sources, can be difficult to explain even for those experts.
It is better to extract knowledge from the data immediately and act on it coherently and quickly, without attempting complex interpretations. That is what data intelligence solutions on big data can offer.
The implementation of a data intelligence solution on big data can exploit a platform called a data lake: a huge archive that ingests data of many kinds (including transactional data) in different formats (e.g. records, PDF, Word, PowerPoint, Excel, e-mail). The data are processed on dynamic platforms that adaptively allocate computing and storage capacity, and the results are presented with modern data-visualization tools in an intuitive graphic form suited to stakeholders' needs.
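The defining trait of a data lake is that records of any shape are landed as-is and structure is imposed only at query time ("schema-on-read"). A minimal sketch of that idea, with an in-memory list standing in for object storage and all source names and fields purely illustrative:

```python
import json
import time

lake = []  # stands in for object storage (a file system or bucket in practice)

def ingest(source: str, payload: dict) -> None:
    """Land a raw record without transforming it, tagged only with its
    source and ingestion time (load first, transform later)."""
    lake.append({
        "source": source,
        "ingested_at": time.time(),
        "raw": json.dumps(payload),
    })

def query(predicate):
    """Impose structure at read time: parse each raw record, then filter."""
    for envelope in lake:
        record = json.loads(envelope["raw"])
        if predicate(record):
            yield record

# Records of completely different shapes land side by side.
ingest("erp", {"isbn": "978-0000000000", "sell_in": 500})
ingest("ereader_app", {"isbn": "978-0000000000", "pages_read": 213})
print(list(query(lambda r: "sell_in" in r)))
```

Because nothing is discarded or reshaped at ingestion, a question nobody anticipated (say, correlating `pages_read` with sell-in) can still be asked of data loaded long before the question existed.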
This approach brings two main benefits: visible data quality and an extended life and relevance of the data.
Data intelligence projects are not exclusively information-technology projects. It is certainly necessary to evaluate where to allocate computing and storage capacity, which technological platforms to adopt and the business models of the services supplied; but it is also a matter of defining and acquiring the skills needed to run these new systems and to design with the new paradigms.
It is about tuning costs coherently with the complexity of stakeholders' requests, adapting them, whether consultancy or infrastructure costs, to the organization's demand.
In this way the entry barriers to data intelligence are lowered: a greater number of organizations and users can access these solutions, interacting with internal stakeholders (publishers, distributors and booksellers) and external ones (authors, agents, readers).
It is a new job that requires a management culture capable of:
- selecting the technologies and methods to tune investments to the real needs of the book publishing industry;
- adopting new paradigms for the development of solutions;
- understanding the gap in skills and resources;
- selecting the talents that can drive the transformation;
- putting newly qualified staff to work on data processing.
These tasks cannot be confined to the IT skills area alone. This is a structured process of knowledge extraction, starting from information and data; in short, it is a question of the quality of the human capital involved, a subject that belongs to the strategic responsibility of top management in any sizeable organization in the publishing industry.
To view Vincenzo Russi’s presentation during the Contec conference, click HERE.