An introduction to OpenDocument Format (ODF)

In this introduction we answer a range of questions that you might have when you start to consider a move to OpenDocument Format. If you have no time to read through all of a section, or if it is too complex for what you need, you can skip to the parts that are more important for you.

From this section you will see that ODF has the potential to simplify basic interoperability, making documents available to everyone, everywhere, on every device.

What is ODF and why do we need it?

If you are not a technical specialist, you may not have come across ODF before, although it has been around as an open standard since 2005 and as an International Standard since 2006. Then again, maybe you have already used the technology it describes, but have just not noticed.

ODF is not a spectacular software programme, a fast moving company or a hyped online service. The three letters ODF stand for "OpenDocument Format", which pretty much covers what it intends to be: a universally shared specification that describes how productivity solutions can reliably store information in a file. Its goal is to make information portable and interoperable with many more tools. ODF is a standard that was created to help people like yourself use the office tools that are most suitable for them – under any circumstance, not just at the desk where you may be sitting now.

The adoption of standards like ODF by the UK government and other governments around the world reflects the modern reality of IT users. A large variety of devices, platforms, software tools and services has become available to create, collaborate and interact.

Not every solution is suitable for everyone though. Each individual may have specific needs and reasonable requirements for customisations to software and hardware. You cannot give a sight-impaired employee a random tablet app to do their work at the service desk of a supplier, or give an elderly citizen with a severe tremor a web tool without the specific usability settings they need.

While a private law firm or a top consulting company has different needs and financial means to outfit its employees with IT tools, other organisations, such as a local court or a civil society group concerned about the environment, may need to be more frugal when it comes to IT spending.

We need all of those solutions to work together seamlessly, because the output from one organisation is the input of the next. That is where ODF comes in, to pave the way for interoperability that will allow everyone to choose the tool that best meets their needs.

What is ODF used for?

OpenDocument Format specifies a file format for office applications, suitable for documents containing text, spreadsheets, charts, presentations, formulas and composite drawings. ODF provides a globally recognised standard that enables users to exchange information across multiple tools.

A particular strength of OpenDocument Format is that it consolidates different types of office documents into a single file format. ODF provides a basis for storing the content of text documents, spreadsheets, charts, and graphical documents such as drawings and presentations. It also provides other useful features, such as a track changes mechanism (which is currently being extended to also cater for real-time collaborative editing), the ability to add digital signatures to a document and a very powerful generic metadata system.

Using metadata, the regular content of a document can be enriched by complementary layers of information (created automatically or added manually). Examples are provenance data, copyright information and other forms of Linked Data that will help users of a document to discover additional relevant information.
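To give a feel for how accessible this information is, here is a minimal sketch in Python that reads a few simple metadata fields from the meta.xml part of an ODF document. It only covers the basic predefined fields, not the richer Linked Data layer described above. The function name and the sample content are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used in the meta.xml part of an ODF document.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, hypothetical sample content (not taken from a real document).
SAMPLE_META = b"""<?xml version="1.0" encoding="UTF-8"?>
<office:document-meta
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    office:version="1.3">
  <office:meta>
    <dc:title>Parking permit</dc:title>
    <dc:creator>Example Council</dc:creator>
    <dc:date>2021-06-01T09:00:00</dc:date>
  </office:meta>
</office:document-meta>"""


def read_basic_metadata(meta_xml: bytes) -> dict:
    """Return the simple metadata fields that are present in meta.xml."""
    root = ET.fromstring(meta_xml)
    wanted = {
        "title": "dc:title",
        "creator": "dc:creator",
        "date": "dc:date",
        "initial_creator": "meta:initial-creator",
    }
    found = {}
    for key, tag in wanted.items():
        element = root.find(f"office:meta/{tag}", NS)
        if element is not None and element.text:
            found[key] = element.text
    return found


if __name__ == "__main__":
    print(read_basic_metadata(SAMPLE_META))
```

In a real document the bytes would come from the meta.xml part inside the file rather than from a sample string.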

Whilst many users won't notice a difference when they use files that are saved in OpenDocument Format, under the hood, or rather inside the code for each of the files, everything is completely different. Each type of object in the file has its own place, and there is a neat index of every subfile you can find inside. ODF keeps the document's content and layout information separate such that they can be processed independently of each other. The internal structure is completely modernised and transparent.
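If you are curious, you can look at this structure yourself. An ODF file is a ZIP package, and the index mentioned above is a part called META-INF/manifest.xml. The following Python sketch lists what the manifest declares. The helper names and the tiny sample package are assumptions made for illustration, so treat it as a starting point rather than a complete reader.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def list_manifest_entries(odf_file) -> list:
    """Return (path, media type) pairs declared in META-INF/manifest.xml."""
    with zipfile.ZipFile(odf_file) as package:
        manifest = ET.fromstring(package.read("META-INF/manifest.xml"))
    entries = []
    for entry in manifest.findall(f"{{{MANIFEST_NS}}}file-entry"):
        path = entry.get(f"{{{MANIFEST_NS}}}full-path")
        media_type = entry.get(f"{{{MANIFEST_NS}}}media-type")
        entries.append((path, media_type))
    return entries


def build_sample_package() -> io.BytesIO:
    """Build a tiny, hypothetical ODF-like package in memory for the demo."""
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<manifest:manifest xmlns:manifest="{MANIFEST_NS}">'
        '<manifest:file-entry manifest:full-path="/" '
        'manifest:media-type="application/vnd.oasis.opendocument.text"/>'
        '<manifest:file-entry manifest:full-path="content.xml" '
        'manifest:media-type="text/xml"/>'
        '<manifest:file-entry manifest:full-path="styles.xml" '
        'manifest:media-type="text/xml"/>'
        "</manifest:manifest>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("META-INF/manifest.xml", manifest)
        package.writestr("content.xml", "<placeholder/>")
        package.writestr("styles.xml", "<placeholder/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for path, media_type in list_manifest_entries(build_sample_package()):
        print(path, media_type)
```

To try it on one of your own documents, pass the file name to list_manifest_entries instead of the sample package. The separate content.xml and styles.xml parts reflect the split between content and layout.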

It looks a lot like the inside of a webpage, and actually it is very similar. ODF unlocks powerful new capabilities for office applications that were not envisioned in earlier file formats, such as embedded linked data. The rationale for moving to ODF becomes immediately clear as soon as we realise where we want our documents to go: everywhere.

Why do we need to move from the formats that came before?

Normally there is little reason to look inside the files produced by most of the software we use, unless something breaks.

If you've ever tried to do so, you will have noticed that many legacy office files consist of dense binary streams. In other words, what you can see while you type and click during the creation of a document is converted into long incomprehensible rows of zeroes and ones that make absolutely no sense outside of the applications that created them. The content and the wrapper around it are indistinguishable without the application to make sense of it. Changing a single bit from zero to one (or vice versa) could cause the application reading that file to behave in an entirely different way, and there is no easy way to tell.

Understanding files written in legacy formats is a task that is extremely challenging even for experts, including the software developers that have written applications that process them.

Legacy office file formats are as complex as they are because they were created in a time when the developers had to invent their own solutions to get anything done. The flood of business needs in the early days of the office application industry was turned into a spaghetti of software code at an amazing pace. One would need thousands of pages to describe exactly how the resulting files are structured. The result is that converting from one format to another has been complex, expensive and error prone.

To make matters worse, the various file formats we've used for our content for the last few decades are now all deprecated. By deprecated, we mean that the vendor that originally created these legacy formats for its own products is itself no longer actively updating the formats and has moved away from using them as a default option in the current version of its applications. Think for instance of the various versions of .doc, .xls, .ppt and .wpd. This means that as a technology these formats have reached the end of their life cycle. So we need something to take their place.

If we want software to be able to work together now and in the future, we cannot rely on guess work. We need something less cryptic to capture our social and cultural heritage.

This is where ODF fits in. ODF is the universal, vendor-neutral successor to all of these legacy formats. It has the potential to bring transparency to our office applications, resulting in a healthy and competitive market place.

ODF helps you to work over a range of tools and devices

In the early days, IT departments needed to take very little into account when they were procuring office applications. Documents created in a single office application would only travel electronically to a single brand of printers in the hallway and the print room, and so no external parties were affected by the choices IT departments made.

These days most organisations have to interact with many external parties on a daily basis. Because of that our IT departments have to carefully think about user needs when they tender a contract for a new supplier.

The software IT departments choose now for editable documents has to work with a lot of other software and hardware inside and outside our organisations. We need quality solutions that offer real interoperability to give users a good experience when they work with these documents. The formats we use must cater for a multitude of flexible solutions on many devices from different vendors, that can be used in many circumstances.

In the last couple of years we have gained entire new classes of devices such as touch enabled smartphones and tablets. The move to ODF will enable us to work with our documents in the best tool for any given job.

Versatility of ODF

ODF is not tied to a single vendor, and should fit any product. It provides a flexible container that can embed pretty much any existing digital object and allow it to be used in an office document. Custom metadata can be added anywhere, to any element within a document.

This is quite similar to how the HTML standard allows you to reference just about anything: a TIFF image file from a thirty year old scanner, a plugin object from a third party supplier (e.g. Adobe Flash) or a custom font used to style a page. HTML does not define any of these but just has to point to the right place.

ODF opens many possibilities for integrating with business processes and internal applications. For instance, do you need to embed a live webpage inside a presentation? Do you want a complete music score inside a text document? Do you have a pressing need to put a zoomable 3D globe inside a spreadsheet to show where your organisation is buying the most products and services? ODF allows you to do all of that.

Of course, as a container format that was born for offline use, ODF needs to find an alternative way of dealing with what hypertext does by referring to a link. If you include a digital object you have created yourself in an otherwise basic office document, you would still want your users to be able to view that document in any ODF compliant software. An application should not have to understand all the specifics of an object to show your user what it looks like.

To handle this, ODF provides a mechanism for claiming an area within the document to embed multiple alternative renderings of an object like this as a fallback. So the content can still be viewed by a conformant application, no matter what. This feature gives ODF the capability of providing an editable document that is portable across applications, while allowing customisation.
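The fallback idea can be sketched in a few lines of Python. In this simplified illustration a frame holds two alternative renderings of the same content, and a reading application takes the first one it knows how to handle. The sample frame, the file names in it and the function name are hypothetical, and a real application would need to consider far more than the element name.

```python
import xml.etree.ElementTree as ET

DRAW_NS = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hand-written, hypothetical frame: a custom embedded object first,
# followed by a plain image of the same content as a fallback.
SAMPLE_FRAME = f"""
<draw:frame xmlns:draw="{DRAW_NS}" xmlns:xlink="{XLINK_NS}" draw:name="Globe">
  <draw:plugin xlink:href="globe.bin"/>
  <draw:image xlink:href="Pictures/globe.png"/>
</draw:frame>
"""


def pick_rendering(frame_xml: str, supported: set):
    """Return (kind, href) for the first alternative the reader supports."""
    frame = ET.fromstring(frame_xml)
    for child in frame:
        kind = child.tag.split("}", 1)[1]  # drop the namespace part of the tag
        if kind in supported:
            return kind, child.get(f"{{{XLINK_NS}}}href")
    return None  # nothing in this frame can be shown by this reader


if __name__ == "__main__":
    # A reader that only understands images falls back to the picture.
    print(pick_rendering(SAMPLE_FRAME, {"image"}))
```

A reader that supports the custom object uses it, while a simpler reader still shows the image, so the document stays viewable either way.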

The versatility of ODF and its quick adoption in the market place is possible because ODF is itself built on another very common standard called XML (Extensible Markup Language). XML is a well-known web standard for structured data created by the World Wide Web Consortium (W3C). There are quite a few other W3C standards that are reused throughout ODF specifications, such as RDF metadata, Synchronised Multimedia and digital signatures.

This lowers the cost for implementing ODF, while creating an economy of scale and increased robustness. Sharing technological building blocks between the web and office applications is quite logical, because of the many common traits between the two. Quite a few office applications offer online editing environments that are accessed over the internet through a web browser.

More importantly, the shared technology allows users to benefit from the availability of Linked Data and to become part of it. Combining ODF with other XML data means that the structured information inside can be reused or consumed in ODF.

Storing documents for the long term

Beyond the user needs of today, governments also have to look at needs in the more distant future. There are many types of documents that we need to be able to access in the long term, a hundred years from now or more. Examples include birth certificates and legal documents such as permits about property ownership or contracts.

ODF is an advanced and future proof format that can cater for these needs.

Through proper use of ODF, we can all work with documents more reliably than ever before, not just on fully equipped workstations but also on tablets, smartphones and cloud based laptops – even smart TVs. Access to ODF documents is a certainty even for the long term.

Versions of ODF

ODF is a family of technical specifications originating from an international consortium called OASIS, which is made up of hundreds of large and small software companies, academics, government representatives, civil society representatives and other stakeholders. The current version is ODF 1.3, which was approved as an OASIS Standard in 2021. This is the recommended version to use.

If you work with a vendor that does not yet support ODF 1.3, ask them to do so as soon as possible or plan how you can move to another solution provider. Given the availability of many quality tools that are compliant with ODF 1.3, ranging from quality open source solutions and convenient online services to commercial desktop and mobile applications, you have plenty of choice on every platform. It is even possible to open ODF documents locally with your browser and the help of a little JavaScript.

Three older versions of ODF do exist. The first version of the standard was OASIS OpenDocument Format 1.0, also published as ISO/IEC 26300:2006. OASIS OpenDocument Format version 1.1 was published in 2007. And OASIS OpenDocument Format version 1.2 was published in 2011.

ODF is a stable format that handles backwards compatibility well. The structure of the ODF file has changed very little during its lifetime, but each version of the standard adds or standardises some new elements. An older document written in a previous version of ODF should seamlessly open in an application supporting the latest version.

However, if a document that was created in a newer version of the format is opened in an older application, some elements may not be available as they were not part of ODF at the time. You would do best to no longer use products that only support ODF 1.0, ODF 1.1 or ODF 1.2 if you can avoid it. If you do, the risk is that you might create a bad user experience if your documents contain elements that were not yet formally defined in a previous version of the standard.
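You can check which version of ODF a document declares, because the root element of its XML parts carries an office:version attribute. The following Python sketch reads that attribute and compares it with a minimum version. The function names and the sample content are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# Hand-written, hypothetical root element of a content.xml part.
SAMPLE_CONTENT = (
    b'<office:document-content '
    b'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    b'office:version="1.2"/>'
)


def declared_odf_version(xml_part: bytes):
    """Return the office:version value on the root element, or None."""
    root = ET.fromstring(xml_part)
    return root.get(f"{{{OFFICE_NS}}}version")


def version_tuple(text: str) -> tuple:
    """Turn a dotted version such as '1.2' into (1, 2) for comparison."""
    return tuple(int(part) for part in text.split("."))


def is_older_than(version: str, minimum: str = "1.3") -> bool:
    """Report whether the declared version is below the minimum."""
    return version_tuple(version) < version_tuple(minimum)


if __name__ == "__main__":
    version = declared_odf_version(SAMPLE_CONTENT)
    print(version, "is older than 1.3:", is_older_than(version))
```

A check like this could help you find documents that were last saved by an application supporting only an older version of the standard.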

ODF offers the structure to store what it does not (yet) prescribe, but such unknown elements cannot always be expected to work in another application. A good example of this is spreadsheet formulas (OpenFormula). These officially appeared in ODF version 1.2, and were not defined before.

Applications that support ODF 1.2 or ODF 1.3 will all support the same formulas, while applications that supported ODF 1.1 could choose to write their own choice of spreadsheet formulas, so interoperability can be a problem.
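As an illustration, formulas in ODF spreadsheets are stored as text, and a formula written in the standard OpenFormula syntax conventionally starts with the prefix "of:", as in of:=SUM([.A1:.A3]). The Python sketch below looks only at that prefix. This is a simplification: strictly speaking the prefix has to be resolved against the namespace declarations in the file, and the function names and the "vendor" prefix used here are hypothetical.

```python
def formula_prefix(formula_attribute: str):
    """Return the prefix in front of ':=' in a stored formula, or None."""
    text = formula_attribute.strip()
    if text.startswith("="):
        return None  # the formula has no prefix at all
    prefix, separator, _ = text.partition(":=")
    return prefix if separator else None


def looks_like_openformula(formula_attribute: str) -> bool:
    """Assume the conventional 'of' prefix marks OpenFormula syntax."""
    return formula_prefix(formula_attribute) == "of"


if __name__ == "__main__":
    for stored in ("of:=SUM([.A1:.A3])", "vendor:=SUM(A1:A3)", "=SUM(A1:A3)"):
        print(stored, "->", looks_like_openformula(stored))
```

A formula with another prefix, or none at all, is a sign that the document relies on an application-specific formula syntax.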

If you need to send someone a document and they open it up in a version of the software that only supports an older version of ODF they should still be able to view the content. But as a general principle we don't recommend saving to an older version of the standard, or going back and forth between older applications that do not yet support ODF 1.3 and newer ones. This will save everybody a lot of work, frustration and confusion.

Saving versus exporting to ODF

Exporting a file in a certain version of ODF is not the same as being able to work natively with the format. Exporting is a temporary shortcut with limited guarantees and sometimes unknown tradeoffs.

The file needs to contain all the data and application logic required to keep the document editable, while providing a proper fallback and enough information for the rest of the ODF ecosystem to be able to handle any unknown objects gracefully.

If ODF is the default format of an application, that application has to store all of the logic inside the file—or not at all. If a piece of software provides an option to do a one-way export to ODF, it can leave out some elements when it exports. The user then gets an incomplete version with diminished or even no functionality, basically a non-reusable snapshot. Such loss of information means the author of the document implicitly still depends on another copy of the document in a less open format than ODF.

Incomplete or broken support for ODF 1.3 can cause interoperability issues and frustration for users. If a current application has inadequate support for ODF 1.3, you should plan to move to something that better meets your user needs.

Security implications of ODF

The use of ODF can significantly contribute to making organisations less vulnerable to targeted attacks compared to older binary formats, lowering the overall number of computers infected by viruses, spyware and adware. Legacy office files are among the top three most common attack vectors for targeted attacks on organisations.

In a 2011 report, German researcher Christoph Fischer found that the effectiveness of antivirus applications tested against attacks carried out through legacy text document files was very limited. Three out of four antivirus solutions scored a recognition rate of 20% or less.

The structure of a file in OpenDocument Format is written down in a very well defined technical markup language called XML. XML is expressive but can also be defined strictly, meaning that an ODF file has to adhere to a very specific set of rules in order to make a valid document. These machine verifiable constraints are delivered alongside the ODF specification by OASIS, the standards body that maintains the standard.
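Full validation means checking a document against the schema that OASIS publishes, which needs a dedicated validator and is not shown here. As a much smaller illustration of machine verifiable rules, the following Python sketch checks two simple things: that the mimetype entry comes first in the package and is stored uncompressed, and that every XML part is at least well formed. The function names and the sample packages are assumptions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def basic_package_checks(odf_file) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    with zipfile.ZipFile(odf_file) as package:
        infos = package.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        elif infos[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype entry is compressed")
        for info in infos:
            if info.filename.endswith(".xml"):
                try:
                    ET.fromstring(package.read(info.filename))
                except ET.ParseError as error:
                    problems.append(f"{info.filename}: {error}")
    return problems


def make_package(parts: list) -> io.BytesIO:
    """Build an in-memory package from (name, text) pairs, for the demo only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name, text in parts:
            # Store the mimetype entry uncompressed and deflate everything else.
            if name == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            package.writestr(name, text, compress_type=method)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_package([
        ("mimetype", "application/vnd.oasis.opendocument.text"),
        ("content.xml", "<a/>"),
    ])
    print(basic_package_checks(sample))
```

Checks like these do not prove that a document is valid or safe, but they show how simple it is to inspect an ODF file automatically, which is not possible in the same way with a binary format.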

Legacy file formats (such as .doc, .xls and .wpd) came into being in an era where resilience against cyber attacks was not really a part of any design requirements, as computers would typically be offline. The proprietary nature of these formats means that there is no machine processable set of validation rules to run if you want to check if a document contains something odd.

Given that software used in an attack typically consists of relatively short streams of zeroes and ones, binary documents make it easy to hide active attack software, which looks just like the rest of the file. There are lots and lots of elements inside proprietary file formats which are difficult for third parties to understand.

During a presentation at the Sixth ODF Plugfest in Berlin, two researchers from the German Federal Office for Information Security (BSI) listed the following IT security advantages of working with OpenDocument Format:

  • Open discussions about weaknesses in document formats
  • Enabling a deeper analysis of techniques used in attacks
  • Development of custom mechanisms to detect attacks
  • Adapting free software that is used for rendering and processing of document formats to individually specific purposes – also independently from the vendor
  • Prerequisite for software diversity
  • Promotion of a competitive environment for vendors

(source: Thomas Caspers and Oliver Zendel, Current Threats and Open Document Formats, 2011)

Of course, ODF itself does not make insecure software safe. It can only make it easier to validate exactly what is going in and coming out. It is up to the individual application and anti-malware measures to actively protect you and the people you communicate with. Frequent use of, for instance, document macros in an organisation can make the whole organisation more vulnerable to attacks. Organisations have to educate their users to deal responsibly with this type of risk.

Want to know more?

The technical committee that is responsible for producing and maintaining ODF is called the Open Document Format for Office Applications Technical Committee (ODF TC). Current members of the ODF TC include publicly traded companies (such as IBM, Microsoft, Red Hat and Beijing Venustech), small and medium enterprises, a number of open source communities (Multiracio, Collabora Productivity Ltd, The Document Foundation, KDE e.V.), independent industry experts and representatives of ISO/IEC JTC 1/SC 34. SC 34 is a sub-committee of JTC 1, the joint technical committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC).

OASIS is officially responsible for maintaining the ODF standard, but ODF is also officially published as an International Standard ISO/IEC 26300 by ISO/IEC JTC 1.

Anyone can participate within OASIS directly or through membership of a national standards body.

You can also send comments to a dedicated public mailing list which is maintained by the ODF TC members.

Summary

ODF offers many benefits to citizens, companies and governments because it is targeted at innovation, stability, flexibility and accessibility.

As a customer, you will want to be clear in your requirements from vendors that you need the latest version of the standard in order to give your users the best experience.

The move to ODF has many advantages. It will increase competition and innovation and ensure long term access, as well as giving users a lot of exciting new capabilities to look forward to.

Of course there are also some drawbacks, especially for organisations that are still locked into products and services that don't support the standard. In those cases, some additional work will be needed to make the transition, but it is worth doing and should pay for itself over time.

Most of the work you will have to do has nothing to do with the standard itself, but rather with reducing hidden dependencies from past choices that need to be dealt with anyway.

Through this guide we want to help you make the transition in the most efficient and future proof way, with as little disruption for users as possible.