Trouble Accessing Cloud Firestore in a Cloud Run Node.js App

My Angular Universal app on Cloud Run has intermittently been unable to access Firestore since a recent production update. The outages mostly affect a single container instance. Any suggestions for a resolution?

Hey, maybe check whether unused Firestore connections are sticking around too long. I had a similar issue caused by cold-start glitches, and adjusting the connection timeouts helped. Add more debug logs to see if it’s a race during init.
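Rough sketch of the timeout idea, in case it helps. This is a generic wrapper rather than a Firestore setting, and `withTimeout` and its parameters are names I made up for illustration. It only stops waiting; it does not cancel the underlying request.

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
// The underlying operation keeps running; we just stop waiting for it.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms} ms`));
    }, ms);
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time, so cancel the pending timeout
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}
```

You would wrap whatever promise your Firestore read returns, so a hung call turns into a logged error instead of a stalled request.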

Hey, check if your Node/Firebase lib versions are up to date. I had similar issues; after upping container memory and double-checking the env config, my app steadied. Could also be a minor bug in load balancing on that container.

Based on my own experience with Cloud Run and Firestore, intermittent access issues may be linked not only to network or configuration quirks but also to the initialization process of your app. I encountered a similar issue where adjusting the timing of credential loading and Firestore initialization reduced the impact of brief network interruptions. Modifying the startup routine to delay Firestore access until all necessary device information was confirmed available was crucial in my case. Additionally, a comprehensive error handling routine that could log, retry, and reset the connection proved helpful in mitigating these irregular outages.
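A minimal TypeScript sketch of that idea, under stated assumptions: `LazyClient`, `withRetry`, and the `createClient` factory are hypothetical names, and the factory stands in for however your app constructs its Firestore client. The client is built on first use rather than at module load, and a failed call is logged, the cached client is dropped, and the call is retried with exponential backoff.

```typescript
// Hypothetical wrapper: builds the client lazily and allows it to be reset.
class LazyClient<T> {
  private client: T | undefined;
  private createClient: () => T;

  constructor(createClient: () => T) {
    this.createClient = createClient;
  }

  // Build the client on first use instead of at module load.
  get(): T {
    if (this.client === undefined) {
      this.client = this.createClient();
    }
    return this.client;
  }

  // Drop the cached client so the next get() builds a fresh one.
  reset(): void {
    this.client = undefined;
  }
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Log, reset, and retry with exponential backoff. This sketch retries on any
// error; real code would likely retry only errors it considers transient.
async function withRetry<T, R>(
  lazy: LazyClient<T>,
  op: (client: T) => Promise<R>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<R> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await op(lazy.get());
    } catch (err) {
      lastError = err;
      console.warn(`Firestore call failed (attempt ${attempt}/${attempts})`, err);
      lazy.reset();
      if (attempt < attempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  throw lastError;
}
```

Whether resetting the client on every failure is appropriate depends on the failure; the attempt count and delay here are arbitrary starting values.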

Hey, I'm really curious, maybe it's a delay in the Firestore connection? Do you see any warmup logs indicating slower init on that one instance? Might be worth checking retries and timeouts. What do you think?
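If you don't have warmup logs yet, something like this could produce them. It's only a sketch: `timedInit` and `LogFn` are made-up names, and `init` is whatever async step you want to time (for example, the first Firestore read). It writes one JSON line per attempt, on the assumption that your logging setup picks up `severity` and `message` fields from JSON on stdout.

```typescript
type LogFn = (line: string) => void;

// Hypothetical helper: time an async init step and log how long it took.
async function timedInit<T>(
  label: string,
  init: () => Promise<T>,
  log: LogFn = console.log,
): Promise<T> {
  const start = Date.now();
  try {
    const result = await init();
    log(JSON.stringify({
      severity: "INFO",
      message: `${label} ready`,
      durationMs: Date.now() - start,
    }));
    return result;
  } catch (err) {
    log(JSON.stringify({
      severity: "ERROR",
      message: `${label} failed`,
      durationMs: Date.now() - start,
      error: String(err),
    }));
    throw err; // let the caller decide whether to retry
  }
}
```

Comparing `durationMs` across instances should show whether that one container is consistently slower to initialize.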

In my experience, intermittent connectivity issues with Firestore on Cloud Run were often related to how container instances manage environment configuration and retries rather than a broader Firestore outage. A review of my logs showed that a specific instance was missing certain environment variables that other instances had. Building in stronger retry mechanisms and better error logging helped me identify these discrepancies. I recommend verifying that configurations load uniformly across instances and considering an adjustment to the error handling logic to ensure that temporary network or configuration issues do not lead to prolonged outages.
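As a minimal sketch of the configuration check, assuming the app reads its settings from environment variables: the variable names in `REQUIRED_VARS` are placeholders rather than anything Firestore or Cloud Run defines, and `findMissingEnv` is a hypothetical helper.

```typescript
type Env = Record<string, string | undefined>;

// Placeholder names: replace with the variables your app actually reads.
const REQUIRED_VARS = ["APP_FIRESTORE_PROJECT", "APP_API_BASE_URL"];

// Return the names of required variables that are unset or blank.
function findMissingEnv(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

At startup, each instance could call `findMissingEnv(process.env, REQUIRED_VARS)` and log the result together with the `K_REVISION` value that Cloud Run sets, which makes it easier to see whether one instance started with a different configuration.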