Workflow Skipping Files Due to Time-Specific Delivery Paths in Courier

This article explains why your courier may skip files that arrive in time-specific subfolders (e.g., /00/00/ and /12/00/ folders) and provides the recommended solution to ensure all data is successfully ingested.

 

Problem

You are seeing files skipped or ingestion workflows failing to pick up all expected data, even though files are present in the source location.

The issue occurs when your source system delivers files at two different times within the day, typically resulting in files landing in separate hourly subdirectories (e.g., .../yyyy/MM/dd/00/00/ and .../yyyy/MM/dd/12/00/).

Your daily scheduled workflow ingests only the files in the earlier subdirectory (00/) and misses the files that land later in 12/. Because the next day's courier run looks for the new day's files, the previous day's late files are never picked up.

 

Root Cause: File Pattern and Timing Mismatch

The root cause is that a single courier running on a daily schedule cannot reliably capture files that are delivered 12 hours apart.

  1. Limited Window: The original courier's file pattern and execution window are configured to pick up files from a single time period (e.g., 00/00).
  2. Skipping Late Files: The courier executes, pulls the files in the 00/00 folder, and completes. The late-arriving files in the 12/00 folder either do not exist yet when the courier runs or arrive after its time window has closed, so they are skipped.
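
The timing mismatch described above can be illustrated with a small simulation. This is only a sketch and not how the courier is implemented: the 01:00 run time, the function names, and the dates are assumptions chosen for the example. It models one daily courier that looks only at the folders for its own run date and reports which delivery folders are never visited.

```python
from datetime import datetime, timedelta

# Hypothetical schedule: files land at 00:00 and 12:00, the courier runs at 01:00.
DELIVERY_HOURS = (0, 12)
RUN_HOUR = 1


def folders_present(day, now):
    """Return the delivery folders for `day` that already exist at time `now`."""
    return {
        f"{day:%Y/%m/%d}/{hour:02d}/00"
        for hour in DELIVERY_HOURS
        if day + timedelta(hours=hour) <= now
    }


def simulate_single_courier(first_day, num_days):
    """Run one daily courier that only looks at the folders for its run date."""
    ingested, delivered = set(), set()
    for i in range(num_days):
        day = first_day + timedelta(days=i)
        run_time = day + timedelta(hours=RUN_HOUR)
        # The courier picks up whatever exists for today at run time.
        ingested |= folders_present(day, run_time)
        # Track everything the source has delivered so far, across all days.
        for j in range(i + 1):
            delivered |= folders_present(first_day + timedelta(days=j), run_time)
    return sorted(delivered - ingested)


skipped = simulate_single_courier(datetime(2024, 3, 4), 3)
print(skipped)
```

Under these assumptions, the folders left over are the 12/00 folders of the earlier days: they did not exist when that day's run happened, and no later run looks back at them.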

Solution: Implement a Two-Courier Strategy

The most reliable solution is to create a second, dedicated courier that targets only the later 12/ delivery folder and runs with a time offset, so that it picks up the files the original courier missed.

Step-by-Step Implementation

This example uses a source that delivers files to the /00/00 and /12/00 subfolders.

1. Create the Second Courier (Targeting Late Files)

Copy your original courier and adjust the file pattern of the copy:

  • Go to Sources. Find your original courier (e.g., AgencyData Courier), click the ellipsis (...), and select Make a copy.
  • Name the Copy: Name the new courier descriptively, such as AgencyData_12.
  • Update the File Pattern: Edit the new courier and change the file pattern to explicitly include the missing 12/ path.
    • Original Pattern Example: yyyy'/'MM'/'dd'/*/*/*.json
    • New Pattern: yyyy'/'MM'/'dd'/12/*/*.json
    • This ensures the new courier only searches the subdirectory where the late files land.
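
To see the effect of the narrower pattern, the sketch below checks two sample paths against both patterns. It is an approximation under stated assumptions: the date portion is resolved with a hypothetical resolve helper, the sample file names are made up, and each * is assumed to match within a single folder level, which may differ from how your courier evaluates wildcards.

```python
from datetime import date
from fnmatch import fnmatchcase

# Wildcard parts of the two file patterns; the date part is resolved separately.
ORIGINAL_SUFFIX = "*/*/*.json"
LATE_SUFFIX = "12/*/*.json"


def resolve(day, suffix):
    """Hypothetical helper: substitute the date into the pattern."""
    return f"{day:%Y/%m/%d}/{suffix}"


def matches(path, pattern):
    """Match segment by segment so that * never crosses a folder boundary."""
    path_parts, pattern_parts = path.split("/"), pattern.split("/")
    return len(path_parts) == len(pattern_parts) and all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


day = date(2024, 3, 5)
files = [
    "2024/03/05/00/00/early.json",  # made-up sample file names
    "2024/03/05/12/00/late.json",
]

original_hits = [f for f in files if matches(f, resolve(day, ORIGINAL_SUFFIX))]
late_hits = [f for f in files if matches(f, resolve(day, LATE_SUFFIX))]
print(original_hits)
print(late_hits)
```

In this sketch the original pattern matches both paths, which shows that the pattern itself is not what excludes the late files. They are missed because of when the courier runs. The new pattern matches only the file under 12/, so the copied courier does not pick up the early files a second time.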

2. Add the New Courier to the Workflow Group (Set Offset)

You must add this new courier to your existing Courier Group with an offset to guarantee it looks back far enough to pick up the missed files from the previous day.

  • Go to Activations and open your Daily Workflow Courier Group.
  • Add the new courier (AgencyData_12).
  • In the settings for this new courier, set the Load Offset to 1 (or 1 day).

Outcome: The primary courier will ingest the early files. The new AgencyData_12 courier will run daily but look back one day to pick up the late files from the 12/ folder, ensuring all data is captured.
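
The look-back behavior can be sketched as simple date arithmetic. This assumes that a Load Offset of 1 makes the courier resolve its date pattern for the day before the run date, as described above. The target_folder function and the dates are hypothetical and only illustrate the idea.

```python
from datetime import date, timedelta


def target_folder(run_date, subfolder, load_offset_days=0):
    """Return the folder a courier would look in for a given run date.

    Assumption: the offset shifts the resolved date back by whole days.
    """
    target_day = run_date - timedelta(days=load_offset_days)
    return f"{target_day:%Y/%m/%d}/{subfolder}"


run_date = date(2024, 3, 6)

# Primary courier: no offset, picks up the early delivery for the run date.
primary = target_folder(run_date, "00/00")

# AgencyData_12: offset of 1 day, picks up the previous day's late delivery.
late = target_folder(run_date, "12/00", load_offset_days=1)

print(primary)
print(late)
```

With a run on 2024-03-06, the primary courier looks in 2024/03/06/00/00 and the offset courier looks in 2024/03/05/12/00. Between the two couriers, each day's early and late folders are visited once.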

Repeat this process for any other couriers that have the same dual-timing delivery schedule.