What could be causing incorrect results in my SQL queries?

Iris72 · January 19, 2025, 11:02am

I’m utilizing Supabase for my database and have crafted SQL queries to update specific table columns. I’ve established pg_cron jobs to automatically run these queries periodically. The application I’m working on is a job application platform that tracks fields such as total_jobs (the total job openings available based on subscription), jobs_applied (the number of applications submitted by a user), and available_jobs (the remaining job applications the user can make). The calculation I expect is that if a user subscribes to a plan offering 500 jobs and has submitted 50 applications, the remaining jobs should be 450, which is derived from the equation available_jobs = total_jobs - jobs_applied. However, I’m encountering issues in production; the available_jobs sometimes displays negative values or equals jobs_applied. For instance, if a user has applied for 27 jobs out of a total of 50, it occasionally shows available_jobs as -27 or 27, but it never accurately reflects the expected result of 23. I suspect there may be inefficiencies in my implementation, and as I’m not an SQL expert, I’m seeking advice on how to resolve these problems. Below is a simplified overview of my SQL structure and cron jobs designed to manage the metrics:

CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username TEXT
);

CREATE TABLE metrics (
  user_id INT REFERENCES users(id),
  total_jobs NUMERIC,
  jobs_applied NUMERIC,
  available_jobs NUMERIC
);

CREATE TABLE applications (
  user_id INT REFERENCES users(id)
);

CREATE TABLE plans (
  plan_id TEXT DEFAULT 'id_AAAAAAAAAAAAAAAAAAAAAAAA',
  user_id INT REFERENCES users(id)
);

INSERT INTO users(username) VALUES ('user1');
INSERT INTO metrics(user_id) VALUES (1);
INSERT INTO applications(user_id) VALUES (1), (1), (1);
INSERT INTO plans(user_id) VALUES (1);

-- Cron job to update jobs applied every hour
SELECT
  cron.schedule (
    '0 * * * *',
    $$
    UPDATE metrics
    SET jobs_applied = (SELECT COUNT(user_id) FROM applications WHERE applications.user_id = metrics.user_id);
    $$
  );

-- Cron job to update total jobs from plans every hour
SELECT
  cron.schedule (
    '0 * * * *',
    $$
    UPDATE metrics
    SET total_jobs = CASE plan_id
      WHEN 'id_AAAAAAAAAAAAAAAAAAAAAAAA' THEN 100
      WHEN 'id_BBBBBBBBBBBBBBBBBBBBBBBB' THEN 150
      END
    FROM plans
    WHERE metrics.user_id = plans.user_id;
    $$
  );

-- Cron job to calculate available jobs
SELECT
  cron.schedule (
    '0 * * * *',
    $$
    UPDATE metrics
    SET available_jobs = total_jobs - jobs_applied;
    $$
  );

Though this setup worked fine during development, I’m now facing these unexpected results in production and would appreciate any insight into improving my implementation.

Nova73 · January 28, 2025, 8:44pm

It is worthwhile to consider the possibility of stale or cached database states when debugging these issues in production. Double-check your database connection configurations and cache settings to ensure there is no delay or stale data that might be used in calculating available_jobs. Additionally, ensure that the cron jobs are not overlapping in execution time, which could cause contention and inconsistencies in the updates. You might also want to add logging to track the execution order and timing of these cron jobs, providing more insights into potential synchronization issues.

Haz_45Write · January 31, 2025, 8:28am

Discrepancies like negative values or mismatched logic in production can often arise from race conditions or transactional issues. Given that you’re relying on cron jobs that run at the same interval, a potential reason could be that changes are occurring in the database while these updates are being computed, leading to inconsistency. Consider using database transactions to ensure atomic updates across metrics updates, so all involved fields are computed and committed in a single operation. Additionally, monitor for any possible data corruption or unexpected data insertions into the applications table that might affect the calculations.

Echo_Vibrant · February 1, 2025, 1:33am

Stupid question, but have you checked if there’s data type mismatch happening in your calculations? Sometimes, SQL operations misbehave due to type conversion, especially when dealing with numeric and integer data types. Verify everything aligns, maybe cast to ensure precision is explicitly given. Also, maybe try a bigger value initially?

Sophia39 · January 30, 2025, 1:30pm

Have you looked into the timing of your pg_cron jobs? Is it possible they’re running at different timings or not completing before the next starts, leading to incorrect data updates? timing issues might cause “incomplete” data visibility. Maybe investigating logs could provide hints. How frequently do you update “applications” record?