Address Matching

Introduction

The fabricext/match API returns a location_id for an address in a given vintage of the Broadband Serviceable Location Fabric (Fabric). The API is unique in how it works. To understand why, consider the following:

  • The match API is not a geocoder. It is an intelligent textual search engine that can scan complete Fabric data sets to identify location data that matches the provided address.
  • The match API is built entirely on the Fabric; only locations within the Fabric can be matched to.
  • The match API is designed to favor returning confident matches. Rather than return a guess, it will return nothing.

Using the Match API

The match API can accept either a full textual address or address components. If provided with a full address string, it will attempt to parse the string into components. For clarity, the parsed components will be returned within the response object.

Example Request and Response

Query Parameters

  • maxresults controls how many locations are returned when multiple locations match. It defaults to 1 and is capped at 10. Setting it above 1 can be useful when you want to review the alternatives for an ambiguous address.

Request

curl 'https://api.costquest.com/fabricext/202406/match?text=659+van+meter+st+cincinnati+oh+45202'

Response

{
    "text": "659 van meter st cincinnati oh 45202",
    "parsed": {
        "city": "CINCINNATI",
        "road": "VAN METER ST",
        "state": "OH",
        "postcode": "45202",
        "house_number": "659"
    },
    "locations": [{
        "matchrank": 1,
        "zip": "45202",
        "city": "CINCINNATI",
        "uuid": "1cab8278-d023-49ea-ad06-f90ac6368723",
        "state": "OH",
        "address": "659 VAN METER ST",
        "latitude": 39.10724922,
        "longitude": -84.50261603,
        "similarity": 1,
        "zip_suffix": "1568",
        "location_id": 1317564777
    }],
    "matchtype": "F",
    "alternates": null,
    "ruleversion": 1,
    "vintage": "202406",
    "sourcekey": null,
    "apiversion": 1,
    "matchcount": 1
}
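The request above can also be assembled programmatically. The following is a minimal Python sketch, not an official client. The function name build_match_url is hypothetical, and passing components as query parameters named after the parsed keys is an assumption based on the component names used in the Bulk Matching request. Authentication is not described in this section, so it is left out.

```python
from urllib.parse import urlencode

# Base URL taken from the example request; everything else is illustrative.
BASE_URL = "https://api.costquest.com/fabricext"


def build_match_url(vintage, text=None, components=None, maxresults=None):
    """Build a match URL from a full address string or from components."""
    if (text is None) == (components is None):
        raise ValueError("provide exactly one of text or components")
    # Assumption: component names mirror the keys of the "parsed" object.
    params = {"text": text} if text is not None else dict(components)
    if maxresults is not None:
        if not 1 <= maxresults <= 10:
            raise ValueError("maxresults must be between 1 and 10")
        params["maxresults"] = maxresults
    # urlencode escapes spaces as "+", as in the curl example.
    return f"{BASE_URL}/{vintage}/match?{urlencode(params)}"


url = build_match_url("202406", text="659 van meter st cincinnati oh 45202")
print(url)
```

The sketch only builds the URL. Sending it, and supplying whatever credentials your account requires, is left to the HTTP client of your choice.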

Response Object

Main Object

  • text - Reflects back the search string. If components were provided, this will be a concatenation of the provided components.
  • parsed - Provides the parsed components for a provided text string, or reflects back the provided components.
  • matchtype
    • F - Indicates that the matched locations were found using the full search information provided (text or components).
    • Z - Indicates that no locations were found using the full provided data and that the matched locations came from a fallback search on house_number+road+zip. This search is only made if those parsed components are available.
    • X - Occurs when the last search performed returns too many records and there is no confidence in choosing results.
    • N - No matches found.
  • alternates - To facilitate searching, certain components may be changed by the match process. Those modifications will be displayed in this object. For example, a building with a house_number of 200A may also be searched as 200 or a road with the value of FRANK RD (REAR BUILDING) may be searched as FRANK RD. When alternates are used, they are searched in addition to the provided data and then ranked against the collection of results.
  • vintage - Reflects back the vintage of data requested to be matched against.
  • ruleversion - Over time, the ruleset for matching addresses may change. This reflects the ruleset version the match was run under.
  • apiversion - Describes changes in the match API implementation that are distinct from those made in matching rules.
  • matchcount - Indicates the number of matches found. This reflects the actual number of matches, not the number returned under maxresults. A higher count may indicate ambiguity in the strength of the overall match.
  • sourcekey - Returns a provided unique ID to facilitate joining result data to source data.
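The matchtype codes lend themselves to a small dispatch step in client code. The following Python sketch shows one way to do that. The function name interpret_match and the status labels are hypothetical, and the choice to return no location for X is an assumption about how a cautious client might behave, not documented API behavior.

```python
def interpret_match(response):
    """Return (status, location_id) for one match API response object."""
    matchtype = response.get("matchtype")
    locations = response.get("locations") or []
    if matchtype in ("F", "Z") and locations:
        # Locations arrive in rank order, so the first is the strongest.
        status = "full" if matchtype == "F" else "zip_fallback"
        return status, locations[0]["location_id"]
    if matchtype == "X":
        # Too many candidates: treat as unresolved rather than guess.
        return "ambiguous", None
    # "N", or any unexpected value, is treated as no match.
    return "no_match", None


sample = {"matchtype": "F", "locations": [{"location_id": 1317564777}]}
print(interpret_match(sample))
```

Keeping Z separate from F lets downstream processes decide whether a fallback match is acceptable for their use case.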

Locations Array

The locations array contains the actual Fabric location information for any matches. These will always be provided in rank order with the strongest matches appearing first. In the case of multiple location matches, consider the implications for your use case.

  • matchrank - A system-assigned ascending value that indicates the quality of the result relative to the requested search, with 1 being the strongest match. This is based on the similarity property.
  • similarity - Returns a value between 0 and 1 that indicates the similarity of the provided address information and the matched Fabric location’s primary address. Higher values indicate more confidence.
  • Fabric and Address Information - These properties reflect data from the Fabric including address, city, state, zip, zip_suffix, latitude, and longitude.
  • location_id - The matched Broadband Serviceable Location ID. These are stable over time.
  • uuid - A secondary unique ID used by CostQuest systems such as APIs. These are not stable over time and are unique only within a given vintage.
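When more than one location is returned, a client may want to separate stronger matches from those that need review. The following Python sketch shows one possible approach. The function name review_locations and the default threshold of 0.5 are assumptions for illustration only. No recommended similarity cutoff is given here, so choose one that suits your use case.

```python
def review_locations(response, min_similarity=0.5):
    """Split returned locations into accepted and needs-review lists."""
    locations = response.get("locations") or []
    accepted = [loc for loc in locations if loc["similarity"] >= min_similarity]
    review = [loc for loc in locations if loc["similarity"] < min_similarity]
    # matchcount is the true number of matches, so it can exceed the
    # number of locations actually returned under maxresults.
    truncated = response.get("matchcount", 0) > len(locations)
    return accepted, review, truncated


sample = {
    "matchcount": 3,
    "locations": [
        {"matchrank": 1, "similarity": 1, "location_id": 1317564777},
        {"matchrank": 2, "similarity": 0.4, "location_id": 1399258011},
    ],
}
accepted, review, truncated = review_locations(sample)
print(len(accepted), len(review), truncated)
```

Because location_id is stable over time and uuid is not, stored results are best keyed on location_id.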

Bulk Matching

For those with very large numbers of addresses to match, CostQuest provides a commercial bulk address matching service. It employs additional processing techniques that can only be performed when all source addresses are known in aggregate. The custom bulk service works similarly to the match API, but results can differ in some circumstances because the two are implemented differently. For more information on bulk address matching, contact sales@costquest.com.

The match API supports batching up to 10 requests into a single POST request. This increases matching throughput and reduces the total time taken to process larger numbers of records. For processes that will match large numbers of addresses using the API, take care to create an implementation that honors the CostQuest API rate limits assigned to your account tier. In addition, make sure the account and/or API key has adequate quota, as matching large amounts of data can quickly consume account credits. Consider saving results from the match API as soon as they are returned so that you do not repeat identical calls and use up credits unnecessarily.

Request

[{
    "sourcekey": "1",
    "text": "659 van meter st cincinnati ohio"
}, {
    "sourcekey": "2",
    "house_number": "1430",
    "road": "e mcmillan st",
    "city": "cincinnati"
}]

Response

[{
    "text": "659 van meter st cincinnati ohio",
    "parsed": {
        "city": "CINCINNATI",
        "road": "VAN METER ST",
        "state": "OHIO",
        "house_number": "659"
    },
    "locations": [{
        "matchrank": 1,
        "zip": "45202",
        "city": "CINCINNATI",
        "uuid": "1cab8278-d023-49ea-ad06-f90ac6368723",
        "state": "OH",
        "address": "659 VAN METER ST",
        "latitude": 39.10724922,
        "longitude": -84.50261603,
        "similarity": 0.6196581323941549,
        "zip_suffix": "1568",
        "location_id": 1317564777
    }],
    "matchtype": "F",
    "alternates": null,
    "ruleversion": 1,
    "vintage": "202406",
    "sourcekey": "1",
    "apiversion": 1,
    "matchcount": 1
}, {
    "text": "1430 e mcmillan st cincinnati",
    "parsed": {
        "city": "CINCINNATI",
        "road": "E MCMILLAN ST",
        "house_number": "1430"
    },
    "locations": [{
        "matchrank": 1,
        "zip": "45206",
        "city": "CINCINNATI",
        "uuid": "56558da0-184a-40ce-821f-26a684437dd2",
        "state": "OH",
        "address": "1430 E MCMILLAN ST",
        "latitude": 39.1252637,
        "longitude": -84.4787451,
        "similarity": 0.5824099794814461,
        "zip_suffix": "2225",
        "location_id": 1399258011
    }],
    "matchtype": "F",
    "alternates": {
        "road": "MCMILLAN ST"
    },
    "ruleversion": 1,
    "vintage": "202406",
    "sourcekey": "2",
    "apiversion": 1,
    "matchcount": 1
}]
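The batching workflow above can be sketched in Python as follows. The helper names make_batches and index_by_sourcekey are hypothetical. The sketch only prepares request bodies and joins results back to source records. It does not send requests, since the POST endpoint, authentication, and rate limits depend on your account and are not specified here.

```python
import json


def make_batches(records, batch_size=10):
    """Yield lists of at most batch_size request objects with sourcekeys.

    records maps a source key to either a full address string or a dict
    of components such as house_number, road, and city.
    """
    if not 1 <= batch_size <= 10:
        raise ValueError("the match API accepts at most 10 requests per batch")
    batch = []
    for key, record in records.items():
        item = {"sourcekey": str(key)}
        item.update({"text": record} if isinstance(record, str) else record)
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch


def index_by_sourcekey(results):
    """Map each response object back to its source record key."""
    return {result["sourcekey"]: result for result in results}


records = {
    "1": "659 van meter st cincinnati ohio",
    "2": {"house_number": "1430", "road": "e mcmillan st", "city": "cincinnati"},
}
for batch in make_batches(records):
    print(json.dumps(batch))  # body for one POST request
```

Writing the output of index_by_sourcekey to storage as each batch returns avoids repeating identical calls if a long run is interrupted.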

FAQs

  • Should I provide a full address or components?
    • It depends. The parser the API employs is powerful, but it may not be 100% accurate in certain cases, especially when strings are dirty, for example when they include notes or non-standard representations.
    • If you are confident in your parsed components, such as in cases where they have been verified, validated, or standardized by CASS-certified software, doing component-based searches can provide a slight boost in return quality.
    • If you are unsure how to proceed, pass in a full search string and let the system perform parsing.
  • Why is the API not honoring zip4 values in searches? Why is it not a valid component for searching?
    • Zip4 is not a supported component. It should not be provided, and if it is, the system will attempt to strip it out.
  • I want to match millions of records. What are my options?
    • Consider contacting CostQuest to have your address data set bulk matched.
    • If you would like to use the match API, carefully consider your account tier and overage fees.
  • Is there reverse matching support? I have actual locations, not addresses.
    • The locate API returns locations for arbitrary coordinates using the context of the underlying Fabric data layers. The fabric/bulk API can then be used to identify the address of the locations.
  • The API is not finding my address. Why is that?
    • See Fabric Addressing Considerations below.
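Since zip4 is not a supported component, one option is to remove it on the client side before submitting a full address string. The following Python sketch is an illustration only. The function name strip_zip4 is hypothetical, and the pattern handles only the hyphenated ZIP+4 form, so other representations would need additional handling.

```python
import re

# Five digits, a hyphen, then four digits, bounded on both sides.
_ZIP4 = re.compile(r"\b(\d{5})-\d{4}\b")


def strip_zip4(address):
    """Reduce a hyphenated ZIP+4 in an address string to the 5-digit ZIP."""
    return _ZIP4.sub(r"\1", address)


cleaned = strip_zip4("659 van meter st cincinnati oh 45202-1568")
print(cleaned)
```

Addresses without a ZIP+4 pass through unchanged, so the function can be applied to every input string.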

Fabric Addressing Considerations

The Fabric models a view of locations that should receive mass-market fixed broadband services consistent with FCC definitions and orders. The locations provided within the Fabric should closely match the physical layout of structures in the real world, but they may not exactly match all structures and other data source attributes assigned to each location.

Because data is pulled from multiple sources to create a location’s attributes, differences can occur when comparing those attributes against other data sources or existing address lists.

  • A location that needs fixed broadband service may not have an address. For example, there are many non-addressed structures in rural and tribal areas.
  • A location that needs fixed broadband service may represent a collection of addresses. This could be a high-rise building or a commercial complex where multiple addresses represent support buildings or buildings on a campus. By design, the Fabric may fold this collection of buildings into a single location_id.
  • There are addresses that represent vacant land where no Fabric location is present. This may be an addressed parcel that is used for agriculture. It could also be vacant or otherwise waiting for development.
  • There are addresses that represent structures where no Fabric location is present. This could be utility infrastructure. It could also be an addressed lot with a storage building on it.
  • Addresses exist within land areas represented by entity boundaries. The Fabric will represent an entity boundary - a college campus, military reservation, or prison - with a single location_id.