Over the past decade and a half, the ONIX for Books standard has become firmly established around the world as the book trade standard for the communication of ‘rich product metadata’ – the type of metadata that is needed to support the trading of books (both printed and digital) in the supply chain, not least for online retailing.
Indeed, international metadata standards are becoming an increasingly essential tool for publishers, distributors and booksellers to access the global publishing market and to develop new metadata-based services that provide added-value information on books and e-books along the supply chain.
What do publishers and their supply chain partners need to know about ONIX in order to implement it effectively and make the most of its adoption? How can they help guide the evolution of ONIX over time, as the international publishing market changes, so that their requirements as users of the standard are brought into the development process?
As Chair of EDItEUR, the international organisation in charge of maintaining and developing the ONIX standards, I will address some of these key questions in this article.
The origin of ONIX
The first version of ONIX for Books (ONIX in the rest of this article) was released in January 2000. ONIX stands for ONline Information eXchange and it was created in response to the growing importance and value of high-quality metadata to publishers and online booksellers, and the recognition that book trade requirements for metadata are different from those of libraries.
ONIX was originally developed jointly by the Association of American Publishers (AAP) and EDItEUR. It built in part upon earlier work within the EC’s <indecs> project (1998–2000), and upon collaboration with Book Industry Communication (BIC) in the UK and the Book Industry Study Group (BISG) in the US. ONIX development and support is now the responsibility of EDItEUR.
As EDItEUR Executive Director Graham Bell notes: “The commercial value of comprehensive and accurate metadata is widely recognised. But the sheer number of new book products each year, and the wide range of metadata used to describe and promote them, places a premium on highly structured and unambiguous data which can be processed automatically. ONIX provides a robust framework for publishers and their supply chain partners to exchange that information, at scale, and with the minimum of ‘friction’. It is deliberately designed to be global in scope, to match our increasingly international book trade, and applicable to both print and e-book sectors.”
After more than 15 years of real-life usage, ONIX has become almost ubiquitous in North America and much of Europe, and is now used increasingly in the Asia-Pacific region too.
ONIX as a formal language
To grasp its potential and to facilitate both the production and the consumption of ONIX messages, it is crucial to understand that ONIX is more than just a ‘standard’.
Rather, ONIX should be considered as a full-fledged formal language, specifically aimed at conveying a wide range of high-quality book metadata, via computer-to-computer communication, to very diverse actors with varying needs.
As with any language, there are three aspects of ONIX that must be considered: its syntax, its semantics and its pragmatics.
Syntax can be defined as ‘a systematic statement of the rules governing the grammatical arrangement of words … in a language’. In ONIX, these ‘words’ are the metadata elements about a product which are included in one given ONIX message.
So, ONIX’s XML-based syntax for product metadata messages defines which data elements or fields can be used and in which order, how they can (or must) be nested, their dependencies, etc.
For instance, ONIX includes a definition of a ‘Language’ element, with the following structure:
<Language>            0…n
    <LanguageRole>    1
    <LanguageCode>    1
    <CountryCode>     0…1
    <ScriptCode>      0…1
The first column is the name (or ‘tag’) of the XML element, and indentation shows the elements’ dependencies and hierarchy. In this case, the last four elements can only be used if a ‘Language’ element exists, and the XML tags for <LanguageRole>, <LanguageCode> etc. have to be nested inside <Language>.
The second column shows the element’s ‘cardinality’: 0…n means that the element is optional and that it can be repeated with no limit, 1 means that one and only one instance of this element must be present, and 0…1 means that the element is optional but that it cannot be repeated.
This example makes it clear that syntax does not deal with ‘meaning’ but with the formal relation between ONIX elements. Although the names of the XML elements in the example might convey a sense of meaning, note that there is a variant of ONIX that uses shorter names (the ‘Short tags’ form of ONIX), in which the previous elements are named ‘language’, ‘b253’, ‘b252’, ‘b251’ and ‘x420’ respectively.
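The equivalence between the two tag forms can be illustrated with a minimal Python sketch that renames short tags to their reference names. The mapping covers only the Language composite; the reference names CountryCode and ScriptCode for ‘b251’ and ‘x420’ are taken from the ONIX 3.0 specification rather than from this article, and the sample values ‘01’ and ‘eng’ are purely illustrative.

```python
import xml.etree.ElementTree as ET

# Short tag -> reference name, for the Language composite only.
# CountryCode and ScriptCode are assumptions drawn from the ONIX 3.0
# specification; check them against the current documentation.
SHORT_TO_REFERENCE = {
    "language": "Language",
    "b253": "LanguageRole",
    "b252": "LanguageCode",
    "b251": "CountryCode",
    "x420": "ScriptCode",
}

def to_reference_tags(element):
    """Return a copy of `element` with known short tags renamed."""
    new_tag = SHORT_TO_REFERENCE.get(element.tag, element.tag)
    renamed = ET.Element(new_tag, element.attrib)
    renamed.text = element.text
    renamed.tail = element.tail
    for child in element:
        renamed.append(to_reference_tags(child))
    return renamed

# Illustrative short-tag fragment; the values are sample data.
short_form = "<language><b253>01</b253><b252>eng</b252></language>"
reference_form = ET.tostring(
    to_reference_tags(ET.fromstring(short_form)), encoding="unicode"
)
print(reference_form)
```

Only the element names change; the nesting and the values are identical in both forms, which is why the choice of tag form has no bearing on meaning.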
The syntax of ONIX messages is formally defined by several XML schemas.
A more human-readable description is available in the ONIX for Books Product Information Format Specification document, and the ONIX for Books Implementation and Best Practice Guide includes a graphical depiction of the syntax.
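To make the idea of cardinality concrete, here is a rough Python sketch that counts the children of a single Language element and reports any that break the 1 or 0…1 rules described above. It is not a substitute for validating against the official XML schemas: it ignores element order and unknown elements. The child names CountryCode and ScriptCode, and the assignment of cardinalities to each child, are assumptions based on the ONIX 3.0 specification.

```python
import xml.etree.ElementTree as ET

# (minimum, maximum) occurrences of each child inside one <Language>.
# Names and cardinalities are assumptions based on the ONIX 3.0
# specification, not definitions taken from this article.
LANGUAGE_CHILDREN = {
    "LanguageRole": (1, 1),
    "LanguageCode": (1, 1),
    "CountryCode": (0, 1),
    "ScriptCode": (0, 1),
}

def check_language(language):
    """Return a list of cardinality problems found in one <Language>."""
    problems = []
    for tag, (lowest, highest) in LANGUAGE_CHILDREN.items():
        count = len(language.findall(tag))
        if count < lowest:
            problems.append(f"{tag}: expected at least {lowest}, found {count}")
        elif count > highest:
            problems.append(f"{tag}: expected at most {highest}, found {count}")
    return problems

valid = ET.fromstring(
    "<Language><LanguageRole>01</LanguageRole>"
    "<LanguageCode>eng</LanguageCode></Language>"
)
invalid = ET.fromstring(
    "<Language><LanguageCode>eng</LanguageCode>"
    "<LanguageCode>fre</LanguageCode></Language>"
)
print(check_language(valid))
print(check_language(invalid))
```

The second fragment fails twice: the mandatory LanguageRole is missing, and LanguageCode is repeated although it may appear only once.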
Semantics can be defined as ‘the study of the meanings of words and phrases in language’.
In ONIX, the meaning-carrying particles (the ‘values’ of the XML elements) can be expressed using either free text in any language (as in a title or an author’s name, e.g. Des éclairs or John Doe), controlled text (as in identifiers, dates or map scales, e.g. 20150829) or, in many cases, coded values (e.g. A15). The meaning of these coded values is provided in a set of almost 200 ONIX Codelists specifically designed and maintained for ONIX.
Thus, code A15 in List 17, ‘Contributor role code’, means that whoever is tagged by this code is the ‘Author of the preface’ of the product described in the ONIX message. If code A16 is used instead, this means that the person (or entity) is the ‘Author of the prologue’. There are around 100 such codes in ONIX List 17 alone: all of them are used to specify the role (or roles) of a contributor in the creation of a product.
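As an illustration, a receiving system might resolve such codes with a simple lookup. The Python sketch below holds only the two List 17 codes discussed above, with labels worded as in this article; the element names Contributor, ContributorRole and PersonName are assumed from the ONIX 3.0 specification, and ‘John Doe’ is sample data.

```python
import xml.etree.ElementTree as ET

# Tiny excerpt of List 17, with labels worded as in this article.
# A real implementation would load the full, current Codelists.
CONTRIBUTOR_ROLES = {
    "A15": "Author of the preface",
    "A16": "Author of the prologue",
}

def describe_contributor(contributor):
    """Turn one <Contributor> element into a human-readable line."""
    code = contributor.findtext("ContributorRole")
    name = contributor.findtext("PersonName")
    label = CONTRIBUTOR_ROLES.get(code, f"unknown role code {code}")
    return f"{name}: {label}"

# Element names are assumptions taken from the ONIX 3.0 specification.
fragment = ET.fromstring(
    "<Contributor>"
    "<ContributorRole>A15</ContributorRole>"
    "<PersonName>John Doe</PersonName>"
    "</Contributor>"
)
description = describe_contributor(fragment)
print(description)
```

Changing the code to A16 changes the meaning of the record without touching its structure, which is precisely the division of labour between syntax and semantics.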
Understanding the range of acceptable values for each ONIX element, together with a good knowledge of the structure and content of the Codelists and of the meaning these codes convey, is crucial for the correct use of ONIX.
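Controlled text values lend themselves to the same kind of checking. The following Python sketch assumes a date supplied as eight digits in year, month, day order, like the 20150829 example above; ONIX allows other date formats as well, so this covers a single case only, and the function name is hypothetical.

```python
from datetime import date, datetime

def parse_onix_date(value):
    """Parse an eight-digit YYYYMMDD string; return None if it is invalid.

    Assumes the YYYYMMDD form only; other date formats are not handled.
    """
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return None
    try:
        return datetime.strptime(value, "%Y%m%d").date()
    except ValueError:
        # Eight digits, but not a real calendar date (e.g. 30 February).
        return None

good = parse_onix_date("20150829")
bad = parse_onix_date("20150230")
print(good, bad)
```

A value that passes the digit check can still be rejected as a date, so both tests are needed before the value is stored or forwarded.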
The semantics of ONIX is jointly described in the ONIX for Books Product Information Format Specification and in the ONIX for Books Codelists documents.
Finally, ‘pragmatics studies how the transmission of meaning depends not only on structural and linguistic knowledge of the speaker and listener, but also on the context of the utterance’.
ONIX is a very powerful and flexible language. As such, there are occasions where the same (apparent) meaning can be conveyed in several ways. In these cases it is crucial to follow the ‘best practice’ convention in order to make sure that the message will be correctly understood downstream.
ONIX has been in use for more than fifteen years. During this time a significant number of best practice rules and usage recommendations have emerged.
There are several publicly available documents describing product metadata best practices, such as those published by the BISG in cooperation with BookNet Canada, or EDItEUR’s Implementation and Best Practice Guide. The latter is the official source for ONIX’s pragmatics and it is updated by EDItEUR with each issue of the Codelists, four times a year.
Another useful source of information about ONIX usage is the collection of Codelists change documents, which explain the rationale and correct usage of new codes (and occasionally, minor changes to existing codes) introduced in each issue of the Codelists. These documents have been publicly available since Issue 7 in March 2007.
Lastly, EDItEUR maintains a public e-forum for discussing and commenting on the implementation of ONIX for Books.
Any organisation planning to implement ONIX, either as a data producer (e.g. publishers), as a data consumer (e.g. retailers) or both (e.g. metadata aggregators) should start by making sure that their internal data models, databases and metadata updating procedures are aligned with ONIX’s semantics and pragmatics.
Although it might look daunting at first sight, implementing the XML functionality needed to produce or ingest ONIX messages is not the most difficult part of the process, provided that the necessary pieces of metadata are available and updateable in ONIX-aligned formats in the organisation’s internal data model.
However, there are cases where the different pieces of metadata required for ONIX messages reside in separate, unconnected databases, or where the values of certain key fields do not map easily onto ONIX codes. In these contexts, changes in the organisation’s data models are required before the XML part of ONIX can be considered. These changes might have a deep impact on the organisation’s workflows and internal procedures, and might require significant manpower.
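The mapping problem can be sketched in a few lines of Python. The in-house role labels, record layout and function name below are entirely hypothetical; only the two List 17 codes come from this article. The point is that values without an obvious ONIX equivalent should be surfaced for editorial attention rather than silently dropped or guessed.

```python
# Hypothetical in-house role labels mapped to List 17 codes.
INTERNAL_TO_ONIX_ROLE = {
    "preface": "A15",
    "prologue": "A16",
}

def map_roles(records):
    """Split records into mapped ones and ones that need manual review."""
    mapped, unmapped = [], []
    for record in records:
        key = record["role"].strip().lower()
        code = INTERNAL_TO_ONIX_ROLE.get(key)
        if code is None:
            unmapped.append(record)
        else:
            # Keep the original fields and add the ONIX code alongside.
            mapped.append({**record, "onix_role": code})
    return mapped, unmapped

# Sample records; names and roles are invented for illustration.
records = [
    {"name": "John Doe", "role": "Preface"},
    {"name": "Jane Roe", "role": "words by"},
]
mapped, unmapped = map_roles(records)
print(len(mapped), "mapped;", len(unmapped), "need review")
```

In practice the size of the unmapped list is a useful early indicator of how much data model work lies ahead before any XML is produced.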
Also, there are many third-party commercially available applications and services for product data management which support ONIX, ranging from cloud-based services suitable for small independent publishers, up to enterprise-scale solutions for the largest global organisations.
EDItEUR and ONIX
As stated before, ONIX development and support is the responsibility of EDItEUR.
EDItEUR is the international group coordinating development of the standards infrastructure for electronic commerce in the book, e-book and serials sectors. EDItEUR provides its membership with research, standards and guidance in such diverse areas as:
- Bibliographic and product information for the book, e-book and serials sectors
- EDI and other e-commerce transaction standards
- The standards infrastructure for digital publishing
- Rights management and trading
ONIX for Books, which is the subject of this article, is only one of the ONIX family of standards maintained by EDItEUR.
During its lifetime, ONIX for Books has been in a state of constant evolution, keeping pace with the drastic changes that have taken place on the Internet and in the book sector – the growth of online bookselling, the commercial market for e-books, and even the recent development of e-book subscription services.
Besides the introduction of new releases of ONIX (the current release is 3.0.2), the pace of change in the Codelists is a good metric of ONIX’s evolution: from an average of two annual issues of the Codelists in 2005, four yearly revisions have been published since 2012.
The total number of codes in the Codelists has doubled, from around 2000 codes in 2005 to well over 4000 codes in August 2015, reflecting the growing complexity of the requirements for a proper rich metadata description of publishers’ products.
For instance, the latest Codelists Issue 30, published in July 2015, introduces 74 new or changed codes. Among these there are specific codes for Chinese regions and school grades, which reflect the adoption of ONIX as the local standard for book metadata by a growing number of countries (ONIX 3.0 has been adopted as a Chinese national standard, GB/T 30330).
EDItEUR’s role and commitment in the maintenance of the ONIX standard is crucial. Also, it is important to understand that the ONIX specifications and associated documents are available and usable free of charge at EDItEUR’s website. There are no licensing charges, royalties or fees payable for the use of the standard.
As stated on its website:
“EDItEUR is a not-for-profit, membership-supported organisation. Members of EDItEUR pay an annual subscription, and in return they have a voice in the governance of the standard and the direction of development, and better access to EDItEUR’s expertise. Membership of EDItEUR is a visible sign of an organisation’s support for an open, standards-based approach to building an effective supply chain, for the benefit of all stakeholders. But membership is not required to make use of ONIX or any other EDItEUR standard.”
Therefore, EDItEUR’s operation depends on its members.
Not only do these members provide the necessary financial support, but they are also the main source of the continuous feedback and suggestions that adapt ONIX to the ever-changing environment. Some members also facilitate national ONIX user groups, which provide broader feedback, and each national group is represented on an International Steering Committee that helps guide the strategic development of the standard.
Direct member participation, or indirect involvement through the different national user groups, are the best ways to ensure that ONIX continues to address the real needs of the market.
EDItEUR’s members have the unique opportunity to participate first hand in the shaping of ONIX, making sure that the standard caters to the needs of their specific markets. They also have early access to all changes and additions to the standard, and the possibility to discuss and comment on them.
Several of the organisations in the TISP Consortium are also members of EDItEUR:
- AIE – Associazione Italiana Editori
- BOEK – Boek.be
- FBF – Frankfurt Book Fair
- FEP – The Federation of European Publishers
- FGEE – Federación de Gremios de Editores de España
- LBF – Reed Exhibitions Ltd – The London Book Fair
- MVB – MVB Marketing und Verlagsservice des Buchhandels GmbH
Any organisation with an interest in the supply chain of products of the publishing industry and their corresponding metadata interchange models is encouraged to consider the possibility of becoming an active member of EDItEUR.
To learn more about metadata standards for book and e-book publishing, check the programme of the next EDItEUR International Supply Chain Seminar, which will take place at the Frankfurt Book Fair the day before the official start of the Fair.
This year’s programme has been organised with the contribution of TISP, and it will focus on use cases and strategies for using technology to extract the most value from the supply chain.