Search Data File

A data file is a collection of the catalog items available in your online store. This data must be converted into a format that the search engine can parse so that it returns accurate results when a query is entered.

Data sent to the search engine will be loaded into the collection that you specify in the primary configuration file, and will also be used to generate the Search As You Type (SAYT) keywords and navigations.

This article outlines what you must know, along with best practices for structuring your data, so that your search experience is the best it can be.

You can curate your catalog data in two ways:

  1. Use the GroupBy guidelines to create a good data file
  2. Leverage our data enrichment services, which lead to improved conversion rates

Supported Data Format: JSON

Line-delimited JSON is the recommended and preferred format for data upload. The engine returns its responses in JSON, and all data input and output examples use JSON.

Please note that the JSON file must match the following format:

  • Each record must fit on exactly one line; newline and carriage return characters inside values must be escaped.
  • Each record must not exceed 32 MB in size.
  • Each record must be a single JSON object, not an array.
  • Every value must be sent as a quoted string, including numbers.
  • The keys for each item must start with an alphabetic character and then only include alphanumeric characters and underscores.
json
{
  "id": "1001",
  "title": "Ultra Comfort Flip flops",
  "gender": "Women",
  "categoryId": [
    "230"
  ],
  "brand": "Nike",
  "variants": [
    {
      "sku": "1001G-7-B",
      "shoeSize": "7",
      "width": "B",
      "color": "Green",
      "image": "1001G",
      "onSale": "true",
      "originalPrice": "149.95",
      "finalPrice": "99.95",
      "stores": [
        {
          "storeID": "1",
          "inventory": "15",
          "price": "99.95"
        },
        {
          "storeID": "2",
          "inventory": "5",
          "price": "99.95"
        },
        {
          "storeID": "3",
          "inventory": "1",
          "price": "99.95"
        }
      ]
    },
    {
      "sku": "1001P-7-B",
      "shoeSize": "7",
      "width": "B",
      "color": "Pink",
      "image": "1001P",
      "onSale": "true",
      "originalPrice": "149.95",
      "finalPrice": "99.95",
      "stores": [
        {
          "storeID": "1",
          "inventory": "15",
          "price": "99.95"
        },
        {
          "storeID": "2",
          "inventory": "5",
          "price": "99.95"
        },
        {
          "storeID": "3",
          "inventory": "1",
          "price": "99.95"
        }
      ]
    },
    {
      "sku": "1001G-6.5-B",
      "shoeSize": "6.5",
      "width": "B",
      "color": "Green",
      "image": "1001G",
      "onSale": "true",
      "originalPrice": "149.95",
      "finalPrice": "99.95",
      "stores": [
        {
          "storeID": "1",
          "inventory": "5",
          "price": "99.95"
        },
        {
          "storeID": "2",
          "inventory": "2",
          "price": "99.95"
        },
        {
          "storeID": "3",
          "inventory": "2",
          "price": "99.95"
        }
      ]
    },
    {
      "sku": "1001G-9-B",
      "shoeSize": "9",
      "width": "B",
      "color": "Green",
      "image": "1001G",
      "onSale": "true",
      "originalPrice": "149.99",
      "finalPrice": "99.95",
      "stores": [
        {
          "storeID": "1",
          "inventory": "9",
          "price": "99.95"
        },
        {
          "storeID": "2",
          "inventory": "4",
          "price": "99.95"
        },
        {
          "storeID": "3",
          "inventory": "5",
          "price": "99.95"
        }
      ]
    }
  ]
}

The record above is expanded across several lines for readability; in an upload file it must occupy a single line. It contains:

  • The mandatory fields of id and title
  • Metadata, like gender and brand
  • A categoryId field, which allows us to map categories onto the record; this can be an array of values
  • An array of color- and size-based variants, each of which holds its own array of store-level availability
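
The formatting rules listed earlier in this section can be applied in code before a file is written. The following is a minimal Python sketch, not GroupBy tooling: the names `stringify` and `to_upload_line` are hypothetical, "32 MB" is assumed to mean 32 * 1024 * 1024 bytes, and null values are rejected because the source does not say how they should be sent.

```python
import json
import re

# Key rule from the format list: a letter first, then letters, digits, underscores.
KEY_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Assumption: "32 MB" is interpreted as 32 * 1024 * 1024 bytes of UTF-8 text.
MAX_RECORD_BYTES = 32 * 1024 * 1024


def stringify(value):
    """Recursively turn every scalar into a string and check key names."""
    if isinstance(value, dict):
        for key in value:
            if not KEY_PATTERN.fullmatch(key):
                raise ValueError(f"invalid key name: {key!r}")
        return {key: stringify(child) for key, child in value.items()}
    if isinstance(value, list):
        return [stringify(child) for child in value]
    if value is None:
        raise ValueError("null values are not handled in this sketch")
    if isinstance(value, bool):
        # Checked before the generic case because bool is a subclass of int.
        return "true" if value else "false"
    return str(value)


def to_upload_line(record):
    """Serialize one record as a single line of JSON with quoted values."""
    if not isinstance(record, dict):
        raise ValueError("each record must be a single JSON object")
    # json.dumps without indent emits one line and escapes \n and \r in strings.
    line = json.dumps(stringify(record))
    if len(line.encode("utf-8")) > MAX_RECORD_BYTES:
        raise ValueError("record exceeds the 32 MB limit")
    return line


if __name__ == "__main__":
    sample = {"id": 1001, "title": "Ultra Comfort Flip flops",
              "onSale": True, "categoryId": [230]}
    print(to_upload_line(sample))
```

Writing one `to_upload_line` result per line of the output file yields a line-delimited JSON file in which numbers and booleans are already quoted.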

Upload Limits

To protect the system from misuse, uploads are subject to a number of limits. If any limit is exceeded, the upload fails. If you need to evaluate whether these limits could affect you, or you think you need to expand beyond them, please reach out to your Customer Success Director to discuss next steps. The upload limits and restrictions are as follows:

  • 50,000,000 records in a collection
  • 1,000 fields (key names) in a record
  • Field names are restricted to characters in [a-zA-Z0-9-_ ] and must be 80 characters or fewer
  • The nesting depth of any field in a record cannot exceed 8 levels. In the following example, level1.level2.level3.nestedField is the field nestedField at 3 levels deep, while level1.level2.level3.nestedObject.metadata is the field metadata at 4 levels deep.
json
{
  "level1": {
    "level2": {
      "level3": {
        "nestedField": "value1",
        "nestedObject": [
          {
            "metadata": "value2"
          }
        ]
      }
    }
  }
}
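
The per-record limits can be checked locally before uploading. The Python sketch below follows the counting convention of the example above, where each enclosing key adds one level and an array adds none. The helper names `iter_fields` and `check_limits` are hypothetical, and counting "fields" as distinct leaf paths is an assumption; the engine may count differently.

```python
import re

MAX_DEPTH = 8       # from the limits list
MAX_FIELDS = 1000   # from the limits list
# Character class and length taken from the limits list.
FIELD_NAME = re.compile(r"[a-zA-Z0-9\-_ ]{1,80}")


def iter_fields(value, path=()):
    """Yield the key path of every leaf value; arrays add no path segment."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_fields(child, path + (key,))
    elif isinstance(value, list):
        for child in value:
            yield from iter_fields(child, path)
    else:
        yield path


def check_limits(record):
    """Return a list of limit violations found in one record."""
    paths = set(iter_fields(record))
    problems = []
    # Assumption: a "field" is a distinct leaf path within the record.
    if len(paths) > MAX_FIELDS:
        problems.append(f"{len(paths)} fields, limit is {MAX_FIELDS}")
    for path in sorted(paths):
        depth = len(path) - 1  # level1.level2.level3.nestedField -> 3
        if depth > MAX_DEPTH:
            problems.append(f"{'.'.join(path)} is {depth} levels deep")
    for name in sorted({key for path in paths for key in path}):
        if not FIELD_NAME.fullmatch(name):
            problems.append(f"invalid field name: {name!r}")
    return problems


if __name__ == "__main__":
    example = {"level1": {"level2": {"level3": {
        "nestedField": "value1",
        "nestedObject": [{"metadata": "value2"}],
    }}}}
    for field_path in sorted(iter_fields(example)):
        print(".".join(field_path), len(field_path) - 1)
    print(check_limits(example))
```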

Invalid JSON Record

If an invalid record is found, the response contains a detailed message indicating where the error occurred. The following example shows JSON data with an invalid record on the first line: the record is missing its closing brace.

Invalid Example
json
{ "id":"1", "title":"Pet Rock", "price":"0.99",   "priceRange":"Under 1 dollar",    "tag":"toy", "categoryId":"4"
{ "id":"2", "title":"Barbie",   "price":"122.99", "priceRange":"Under 150 dollars", "tag":"toy", "categoryId":"5B"}
Response to Invalid JSON
Error on line 1: Unexpected end-of-input: expected close marker for OBJECT (from
[Source: { "id":"1", "title":"Pet Rock", "price":"0.99", "priceRange":"Under 1 dollar", "tag":"toy", "categoryId":"4"; line: 1, column: 0])
at [Source: { "id":"1", "title":"Pet Rock", "price":"0.99", "priceRange":"Under 1 dollar", "tag":"toy", "categoryId":"4"; line: 1, column: 227]
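
Errors of this kind can be caught before uploading by parsing each line locally. The following Python sketch uses only the standard `json` module; the function name `find_invalid_lines` is hypothetical, the messages it reports come from Python's parser and will not match the engine's wording, and skipping blank lines is an assumption about how the engine treats them.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, message) for each line that is not one JSON object."""
    errors = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored rather than reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, exc.msg))
            continue
        if not isinstance(record, dict):
            errors.append((number, "record is not a JSON object"))
    return errors


if __name__ == "__main__":
    # The two lines from the invalid example; the first lacks its closing brace.
    sample = [
        '{ "id":"1", "title":"Pet Rock", "price":"0.99", "priceRange":"Under 1 dollar", "tag":"toy", "categoryId":"4"',
        '{ "id":"2", "title":"Barbie", "price":"122.99", "priceRange":"Under 150 dollars", "tag":"toy", "categoryId":"5B"}',
    ]
    for number, message in find_invalid_lines(sample):
        print(f"line {number}: {message}")
```

Because the function accepts any iterable of strings, an open file handle can be passed directly to check a data file line by line.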
Deprecated Data Formats:

CSV and XML uploads are still supported, but they may not be compatible with all of our features. Check with your project lead before using either format to understand the roadmap for our support.

Best Practices

Data can be gzipped before uploading to improve upload speed and reduce the size of the data file. Use the Linux gzip utility to compress the data. As shown in the example below, the name of a gzipped file must have a supported file type extension, followed by the .gz extension.

curl
curl -X POST \
        --form "config=@uploadConfig.txt" \
        --form "data=@fullData.json.gz" \
        "https://<customerid>.groupbycloud.com/data/v1/upload/stream"
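
Where compression has to happen inside a script rather than on the command line, Python's standard `gzip` module writes the same gzip file format. The sketch below is an illustration only: the function name `gzip_data_file` is hypothetical, and it is assumed, not confirmed by this article, that the upload service accepts files produced this way just as it accepts output of the Linux gzip utility.

```python
import gzip
import shutil
from pathlib import Path


def gzip_data_file(source):
    """Compress source into a sibling file named source + '.gz' and return its path."""
    source = Path(source)
    # Keep the original extension and append .gz, e.g. fullData.json -> fullData.json.gz
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as raw, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(raw, packed)  # streams in chunks, so large files are fine
    return target
```

Calling `gzip_data_file("fullData.json")` leaves the original file in place and writes `fullData.json.gz` beside it, which matches the file name used in the curl command.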