‘Example: Using a Named Query with PostgreSQL’
In this example, you:
- Use the PostgreSQL database
pgtestdb, userpxfuser1, and PXF JDBC connector server configurationpgsrvcfgthat you created in Example: Reading From and Writing to a PostgreSQL Database. - Create two PostgreSQL tables and insert data into the tables.
- Assign all privileges on the tables to
pxfuser1. - Define a named query that performs a complex SQL statement on the two PostgreSQL tables, and add the query to the
pgsrvcfgJDBC server configuration. - Create a PXF readable external table definition that matches the query result tuple and also specifies read partitioning options.
- Read the query results, making use of PXF column projection and filter pushdown.
Create the PostgreSQL Tables and Assign Permissions
Perform the following procedure to create PostgreSQL tables named customers and orders in the public schema of the database named pgtestdb, and grant the user named pxfuser1 all privileges on these tables:
-
Identify the host name and port of your PostgreSQL server.
-
Connect to the
pgtestdbPostgreSQL database as thepostgresuser. For example, if your PostgreSQL server is running on the default port on the host namedpserver:$ psql -U postgres -h pserver -d pgtestdb -
Create a table named
customersand insert some data into this table:CREATE TABLE customers(id int, name text, city text, state text); INSERT INTO customers VALUES (111, 'Bill', 'Helena', 'MT'); INSERT INTO customers VALUES (222, 'Mary', 'Athens', 'OH'); INSERT INTO customers VALUES (333, 'Tom', 'Denver', 'CO'); INSERT INTO customers VALUES (444, 'Kate', 'Helena', 'MT'); INSERT INTO customers VALUES (555, 'Harry', 'Columbus', 'OH'); INSERT INTO customers VALUES (666, 'Kim', 'Denver', 'CO'); INSERT INTO customers VALUES (777, 'Erik', 'Missoula', 'MT'); INSERT INTO customers VALUES (888, 'Laura', 'Athens', 'OH'); INSERT INTO customers VALUES (999, 'Matt', 'Aurora', 'CO'); -
Create a table named
ordersand insert some data into this table:CREATE TABLE orders(customer_id int, amount int, month int, year int); INSERT INTO orders VALUES (111, 12, 12, 2018); INSERT INTO orders VALUES (222, 234, 11, 2018); INSERT INTO orders VALUES (333, 34, 7, 2018); INSERT INTO orders VALUES (444, 456, 111, 2018); INSERT INTO orders VALUES (555, 56, 11, 2018); INSERT INTO orders VALUES (666, 678, 12, 2018); INSERT INTO orders VALUES (777, 12, 9, 2018); INSERT INTO orders VALUES (888, 120, 10, 2018); INSERT INTO orders VALUES (999, 120, 11, 2018); -
Assign user
pxfuser1all privileges on tablescustomersandorders, and then exit thepsqlsubsystem:GRANT ALL ON customers TO pxfuser1; GRANT ALL ON orders TO pxfuser1; \q
Configure the Named Query
In this procedure you create a named query text file, add it to the pgsrvcfg JDBC server configuration, and synchronize the PXF configuration to the SynxDB cluster.
This procedure will typically be performed by the SynxDB administrator.
-
Log in to the SynxDB coordinator host:
$ ssh gpadmin@<coordinator> -
Navigate to the JDBC server configuration directory
pgsrvcfg. For example:gpadmin@coordinator$ cd $PXF_BASE/servers/pgsrvcfg -
Open a query text file named
pg_order_report.sqlin a text editor and copy/paste the following query into the file:SELECT c.name, c.city, sum(o.amount) AS total, o.month FROM customers c JOIN orders o ON c.id = o.customer_id WHERE c.state = 'CO' GROUP BY c.name, c.city, o.month -
Save the file and exit the editor.
-
Synchronize these changes to the PXF configuration to the SynxDB cluster:
gpadmin@coordinator$ pxf cluster sync
Read the Query Results
Perform the following procedure on your SynxDB cluster to create a PXF external table that references the query file that you created in the previous section, and then reads the query result data:
-
Create the PXF external table specifying the
jdbcprofile. For example:CREATE EXTERNAL TABLE pxf_queryres_frompg(name text, city text, total int, month int) LOCATION ('pxf://query:pg_order_report?PROFILE=jdbc&SERVER=pgsrvcfg&PARTITION_BY=month:int&RANGE=1:13&INTERVAL=3') FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');With this partitioning scheme, PXF will issue 4 queries to the remote SQL database, one query per quarter. Each query will return customer names and the total amount of all of their orders in a given month, aggregated per customer, per month, for each month of the target quarter. SynxDB will then combine the data into a single result set for you when you query the external table.
-
Display all rows of the query result:
SELECT * FROM pxf_queryres_frompg ORDER BY city, total; name | city | total | month ------+--------+-------+------- Matt | Aurora | 120 | 11 Tom | Denver | 34 | 7 Kim | Denver | 678 | 12 (3 rows) -
Use column projection to display the order total per city:
SELECT city, sum(total) FROM pxf_queryres_frompg GROUP BY city; city | sum --------+----- Aurora | 120 Denver | 712 (2 rows)When you run this query, PXF requests and retrieves query results for only the
cityandtotalcolumns, reducing the amount of data sent back to SynxDB. -
Provide additional filters and aggregations to filter the
totalin PostgreSQL:SELECT city, sum(total) FROM pxf_queryres_frompg WHERE total > 100 GROUP BY city; city | sum --------+----- Denver | 678 Aurora | 120 (2 rows)In this example, PXF will add the
WHEREfilter to the subquery. This filter is pushed to and run on the remote database system, reducing the amount of data that PXF sends back to SynxDB. TheGROUP BYaggregation, however, is not pushed to the remote and is performed by SynxDB.
