Did y’all start with a batch process and then add streaming later, ending up with an architecture that does both? It seems like if you have a highly scalable streaming pipeline you wouldn’t need any batch processing. Was the queue filling up faster than it could be drained?
If you’re already using GCS, why not also use PubSub for your queue and let Google worry about scaling the backend?
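To make that suggestion concrete, here’s a rough sketch of what a Pub/Sub-backed queue could look like with the google-cloud-pubsub Python client. The project, topic, subscription, and bucket names below are made up for illustration, not anything from your setup:

    from concurrent.futures import TimeoutError
    from google.cloud import pubsub_v1

    # Hypothetical names, just for the example.
    PROJECT = "my-project"
    TOPIC = "ingest-topic"
    SUBSCRIPTION = "ingest-sub"

    # Producer side: enqueue a unit of work (e.g. a GCS object path) by publishing it.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(PROJECT, TOPIC)
    publisher.publish(topic_path, b"gs://my-bucket/path/to/object").result()

    # Consumer side: a streaming pull subscriber; Pub/Sub scales the backlog for you,
    # so "the queue filling up" becomes Google's problem rather than yours.
    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(PROJECT, SUBSCRIPTION)

    def handle(message):
        print("processing", message.data.decode())
        message.ack()  # ack once the work is done so it isn't redelivered

    streaming_pull = subscriber.subscribe(sub_path, callback=handle)
    with subscriber:
        try:
            streaming_pull.result(timeout=60)  # process messages for a minute, then stop
        except TimeoutError:
            streaming_pull.cancel()
            streaming_pull.result()

Obviously this glosses over ordering, dedup, and dead-lettering, but for a simple work queue it’s a lot less to operate than a self-managed broker.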