Error in Calcite SQL Join: 'Multiple entries with same key' Message

I’m attempting to run a SQL join operation between two distinct databases, CockroachDB and H2, via Apache Calcite. The query executes properly when conducted directly on each database individually, yet I encounter an error when running it through Calcite. The error states:

Query execution error: while executing SQL "EXPLAIN PLAN FOR SELECT c.customer_name, o.order_id, o.order_date FROM CRDB.customers c JOIN H2DB.orders o ON c.customer_id = o.customer_id": Encountered multiple entries with identical key: primary=JdbcTable {primary} and primary=JdbcTable {primary}

Below is the snippet of my Java code involved in this connection:

// Set up connection to Calcite with case sensitivity enabled
Properties properties = new Properties();
properties.setProperty("lex", Lex.MYSQL.name());
properties.setProperty("caseSensitive", "true");
Connection conn = DriverManager.getConnection("jdbc:calcite:", properties);
CalciteConnection calciteConn = conn.unwrap(CalciteConnection.class);
SchemaPlus schema = calciteConn.getRootSchema();

// Connect to CockroachDB
org.postgresql.ds.PGSimpleDataSource cockroachDataSource = new org.postgresql.ds.PGSimpleDataSource();
cockroachDataSource.setUrl("jdbc:postgresql://localhost:26257/sample");
cockroachDataSource.setUser("root");
cockroachDataSource.setPassword("");

// Connect to H2
org.h2.jdbcx.JdbcDataSource h2DataSource = new org.h2.jdbcx.JdbcDataSource();
h2DataSource.setURL("jdbc:h2:testdata;AUTO_SERVER=TRUE");
h2DataSource.setUser("");
h2DataSource.setPassword("");

// Add database schemas to root
schema.add("CRDB", 
    JdbcSchema.create(schema, "CRDB", cockroachDataSource, 
        "sample",  // catalog
        "public")); // schema

schema.add("H2DB", 
    JdbcSchema.create(schema, "H2DB", h2DataSource,
        "DEFAULT",  // catalog
        "PUBLIC")); // schema

// Construct join SQL query
String joinSQL = "SELECT c.customer_name, o.order_id, o.order_date " +
    "FROM CRDB.customers c " +
    "JOIN H2DB.orders o ON c.customer_id = o.customer_id";

try (Statement stmt = calciteConn.createStatement()) {
    System.out.println("Running query: " + joinSQL);
    // Enable logging for execution plan
    stmt.execute("EXPLAIN PLAN FOR " + joinSQL);
    ResultSet planResultSet = stmt.getResultSet();
    System.out.println("\nExecution Plan:");
    while (planResultSet.next()) {
        System.out.println(planResultSet.getString(1));
    }
    // Run the actual query and retrieve results
    ResultSet resultSet = stmt.executeQuery(joinSQL);
    System.out.println("\nResults:");
    while (resultSet.next()) {
        System.out.printf("Client: %s, Order ID: %s, Date: %s\n",
            resultSet.getString(1),
            resultSet.getString(2),
            resultSet.getString(3));
    }
} catch (SQLException ex) {
    System.err.println("Query execution error: " + ex.getMessage());
    ex.printStackTrace();
}
// Close the connection
conn.close();

Additional Details: Using Calcite Version: 1.38.0, CockroachDB Version: 24.*, H2 Version: 2.3, Java Version: 21.

Efforts Made:

  1. Confirmed the existence of customers and orders tables in their respective databases.
  2. Checked correctness of connection configurations for both databases.
  3. Ran the SQL query on each database independently, and it executed correctly.

Question:

What might be the reason behind receiving the "Multiple entries with same key" error within Calcite, and what steps should I take to fix this issue and run the join successfully across both databases? If needed, please provide corrections for the code as well.

Hey Zack, looks like a weird issue you’re facing! Sometimes schema names in your SQL query might overlap internally in Calcite. You can try prefixing the table names with their full schema details like CRDB.public.customers and H2DB.PUBLIC.orders instead. also, check if table aliases are causing conflicts. let me know if this helps!

It’s possible that the issue arises from the configuration of your JDBC schemas within Calcite. Calcite might be having trouble distinguishing between the two sets of metadata coming from different database sources. You could try ensuring that each JdbcSchema is uniquely identified by specifying a unique ‘schemaName’ and ‘tableName’. It would also be beneficial to check if the automatic identifiers of metadata from both databases are in conflict, such as auto-incremented keys being assigned similarly across schemas. Adjusting how tables are imported using more detailed schema catalog information might also mitigate this issue.