---
title: 'PostgreSQL XML Data Type'
page_title: 'PostgreSQL XML Data Type'
page_description: 'In this tutorial, you will learn how to use the PostgreSQL XML data type to store XML documents in the database.'
prev_url: 'https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-xml-data-type/'
ogImage: ''
updatedOn: '2024-04-20T03:57:22+00:00'
enableTableOfContents: true
previousLink:
  title: 'PostgreSQL enum'
  slug: 'postgresql-tutorial/postgresql-enum'
nextLink:
  title: 'PostgreSQL BYTEA Data Type'
  slug: 'postgresql-tutorial/postgresql-bytea-data-type'
---

**Summary**: in this tutorial, you will learn how to use the PostgreSQL XML data type to store XML documents in the database.

## Introduction to the PostgreSQL XML data type

PostgreSQL provides a built-in `XML` data type that allows you to store XML documents directly within the database.

Here’s the syntax for declaring a column with the `XML` type:

```sql
column_name XML
```

The XML data type offers the following benefits:

- **Type Safety**: PostgreSQL checks XML values when you insert or update them, ensuring that the stored data is well-formed XML.
- **Built-in XML functions and operators**: PostgreSQL supports many XML functions and operators to manipulate XML data effectively.

## PostgreSQL XML data type example

First, [create a table](postgresql-create-table) called `person`:

```sql
CREATE TABLE person(
    id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    info XML
);
```

In this `person` table:

- `id` is an [identity column](postgresql-identity-column) that serves as the [primary key](postgresql-primary-key) column of the table.
- `info` is a column with the type `XML` that will store the XML data.

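
The type-safety point above can be explored without touching the `person` table. The following is a minimal sketch, assuming a PostgreSQL build with XML support in which the built-in `xml_is_well_formed_document()` function is available; the sample strings are made up for illustration.

```sql
-- Illustrative strings only; neither statement modifies the person table.

-- A complete document with properly nested tags is well-formed.
SELECT xml_is_well_formed_document(
    '<person><name>John Doe</name></person>'
) AS well_formed;

-- The <name> element is never closed, so the check fails.
SELECT xml_is_well_formed_document(
    '<person><name>John Doe</person>'
) AS well_formed;
```

The first query should return `true` and the second `false`. Casting the second string to `XML` or passing it to `XMLPARSE` would raise an error instead, which is how malformed values are kept out of an `XML` column.
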
Second, [insert a row](postgresql-insert) into the `person` table:

```sql
INSERT INTO person (info)
VALUES (
    XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
    <person>
        <name>John Doe</name>
        <age>35</age>
        <city>San Francisco</city>
    </person>')
);
```

In this statement:

- `DOCUMENT` indicates that the input string is a complete XML document starting with the XML declaration `<?xml version="1.0" encoding="UTF-8"?>` and having the root element `<person>`.
- The `XMLPARSE` function converts the string into an XML document.
- The `INSERT` statement inserts the new XML document into the `info` column of the `person` table.

Third, [insert multiple rows](postgresql-insert-multiple-rows) into the `person` table:

```sql
INSERT INTO person (info)
VALUES
    (
        XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
    <person>
        <name>Jane Doe</name>
        <age>30</age>
        <city>San Francisco</city>
    </person>')
    ),
    (
        XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
    <person>
        <name>John Smith</name>
        <age>40</age>
        <city>New York</city>
    </person>')
    ),
    (
        XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?>
    <person>
        <name>Alice Johnson</name>
        <age>30</age>
        <city>Los Angeles</city>
    </person>')
    );
```

Fourth, retrieve the names of persons from the XML documents using the `xpath()` function:

```sql
SELECT xpath('/person/name/text()', info) AS name
FROM person;
```

Output:

```text
       name
-------------------
 {"John Doe"}
 {"Jane Doe"}
 {"John Smith"}
 {"Alice Johnson"}
(4 rows)
```

Each row in the result set is an array of XML values representing person names. Since each person has one name, the result array has only one element.

To retrieve the person names as text rather than as XML arrays, take the first array element and cast it to text:

```sql
SELECT (xpath('/person/name/text()', info))[1]::text AS name
FROM person;
```

Output:

```text
     name
---------------
 John Doe
 Jane Doe
 John Smith
 Alice Johnson
(4 rows)
```

How it works.

- First, the XPath `'/person/name/text()'` returns the text of the name node of the XML document. It returns an array that includes all matching values.
- Second, the `[1]` subscript returns the first element of the array.
- Third, the `::text` cast converts the XML value to text.

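
The three steps above can be observed one at a time by asking PostgreSQL for the type of each intermediate expression. This is a minimal sketch that runs against an inline sample document instead of the `person` table; the alias `t(doc)` and the output column names are hypothetical.

```sql
-- Inline sample document; t(doc) is a hypothetical alias for illustration.
SELECT
    -- Step 1: xpath() returns an array of XML values.
    pg_typeof(xpath('/person/name/text()', doc))            AS array_type,
    -- Step 2: the [1] subscript yields a single XML value.
    pg_typeof((xpath('/person/name/text()', doc))[1])       AS element_type,
    -- Step 3: the ::text cast yields plain text.
    pg_typeof((xpath('/person/name/text()', doc))[1]::text) AS cast_type
FROM (
    VALUES ('<person><name>John Doe</name></person>'::xml)
) AS t(doc);
```

The three columns should report `xml[]`, `xml`, and `text`, which is why both the subscript and the cast are needed before the value can be compared with an ordinary string.
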
Fifth, retrieve the ages of persons:

```sql
SELECT (xpath('/person/age/text()', info))[1]::text::integer AS age
FROM person;
```

Output:

```text
 age
-----
  35
  30
  40
  30
(4 rows)
```

In this query:

- The XPath `/person/age/text()` returns the text of the age nodes as an array of XML values.
- The `[1]` subscript returns the first element of the array.
- The `::text` cast converts the element to text.
- The `::integer` cast converts the text to an integer.

In this example, we cast an XML value to text and then the text to an integer because we cannot cast an XML value directly to an integer.

Sixth, retrieve the name, age, and city from the XML document:

```sql
SELECT
    (xpath('/person/name/text()', info))[1]::text AS name,
    (xpath('/person/age/text()', info))[1]::text::integer AS age,
    (xpath('/person/city/text()', info))[1]::text AS city
FROM person;
```

Output:

```text
     name      | age |     city
---------------+-----+---------------
 John Doe      |  35 | San Francisco
 Jane Doe      |  30 | San Francisco
 John Smith    |  40 | New York
 Alice Johnson |  30 | Los Angeles
(4 rows)
```

Seventh, find the person with the name “Jane Doe”:

```sql
SELECT *
FROM person
WHERE (xpath('/person/name/text()', info))[1]::text = 'Jane Doe';
```

Output:

```text
 id |                info
----+------------------------------------
  2 |     <person>                      +
    |         <name>Jane Doe</name>     +
    |         <age>30</age>             +
    |         <city>San Francisco</city>+
    |     </person>
(1 row)
```

## Creating indexes for XML data

If the `person` table has many rows, finding a person by name can be slow because PostgreSQL has to evaluate the XPath expression for every row. You can create an expression index on the XML documents to improve the query performance.

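
Before adding an index, it can be useful to look at the plan PostgreSQL chooses for the name filter from the previous step, so there is a baseline to compare against. This is a minimal sketch; the plan you see depends on the table size and statistics, so no output is shown here.

```sql
-- Baseline: plan for the name lookup before any expression index exists.
-- EXPLAIN only shows the plan; it does not execute the query.
EXPLAIN
SELECT *
FROM person
WHERE (xpath('/person/name/text()', info))[1]::text = 'Jane Doe';
```

With no suitable index, the plan would typically show a sequential scan with the `xpath()` expression applied as a filter to each row.
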
First, create an [expression index](../postgresql-indexes/postgresql-index-on-expression) that extracts the name of a person as an array of text:

```sql
CREATE INDEX person_name
ON person USING BTREE
    (cast(xpath('/person/name', info) as text[]));
```

Second, create a function that inserts 1000 rows into the `person` table for testing purposes:

```sql
CREATE OR REPLACE FUNCTION generate_persons()
RETURNS void AS $$
BEGIN
    INSERT INTO person (info)
    SELECT XMLPARSE(DOCUMENT '<?xml version="1.0" encoding="UTF-8"?><person><name>' ||
               'Person' || generate_series ||
               '</name><age>' ||
               (generate_series % 80 + 18) ||
               '</age><city>' ||
               CASE
                   WHEN generate_series % 3 = 0 THEN 'New York'
                   WHEN generate_series % 3 = 1 THEN 'Los Angeles'
                   ELSE 'San Francisco'
               END ||
               '</city></person>')
    FROM generate_series(1, 1000);
END;
$$ LANGUAGE plpgsql;
```

Third, call the `generate_persons()` function to insert 1000 rows into the `person` table:

```sql
SELECT generate_persons();
```

Fourth, find a person with the name `Jane Doe`:

```sql
EXPLAIN ANALYZE
SELECT *
FROM person
WHERE cast(xpath('/person/name', info) as text[]) = '{Jane Doe}';
```

Output:

```text
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on person  (cost=4.31..17.81 rows=5 width=178) (actual time=0.039..0.040 rows=0 loops=1)
   Recheck Cond: ((xpath('/person/name'::text, info, '{}'::text[]))::text[] = '{"Jane Doe"}'::text[])
   ->  Bitmap Index Scan on person_name  (cost=0.00..4.31 rows=5 width=0) (actual time=0.036..0.037 rows=0 loops=1)
         Index Cond: ((xpath('/person/name'::text, info, '{}'::text[]))::text[] = '{"Jane Doe"}'::text[])
 Planning Time: 0.144 ms
 Execution Time: 0.069 ms
(6 rows)
```

The output shows that the query uses the `person_name` expression index on the `person` table.

Note that the plan reports `rows=0`. The expression `xpath('/person/name', info)` returns the whole `<name>` element rather than its text, so the indexed value is `{"<name>Jane Doe</name>"}` and the comparison with `'{Jane Doe}'` finds no match. To match on the name itself, use `/person/name/text()` in both the index expression and the query.

## Summary

- Use the `XML` data type to store XML documents in the database.
- Use the `xpath()` function to retrieve a value from XML documents.