Data Matching Wizard
Find hidden duplicates in your data — upload a CSV or TSV file and see matching clusters in seconds
The Data Matching Wizard is a browser-based, step-by-step tool that uses Interzoid's AI-powered similarity algorithms to identify and cluster matching records within your data files. Whether you need to deduplicate company names, match individual names across datasets, or find address variations, the wizard walks you through the entire process without writing a single line of code.
The wizard supports 17 languages and can be used from any modern browser. It works with both CSV and TSV files and provides a downloadable match report showing all identified clusters of similar records.
1. Prerequisites
Before using the Data Matching Wizard, you will need:
- An Interzoid API Key: Register for an account and obtain your unique API license key. This key authenticates your requests and tracks usage credits.
- A CSV or TSV Data File: Prepare a comma-separated (CSV) or tab-separated (TSV) file containing the records you want to match. The file should have at least one column of data suitable for matching (company names, individual names, or street addresses).
- Available Credits: Each record processed consumes one API credit. Ensure your account has sufficient credits for the number of records in your file.
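Since each processed record consumes one credit, it can help to count the records in a file before starting a job. The following is a minimal sketch, not part of the wizard. The `count_records` helper and its `has_header` flag are hypothetical, and the guide does not say whether a header row counts as a record.

```python
import csv
import io

def count_records(text: str, delimiter: str = ",", has_header: bool = False) -> int:
    """Count non-empty data rows in CSV or TSV text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Skip rows that are empty or contain only whitespace.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    # Assumption: a header row, if present, is excluded from the estimate.
    if has_header and rows:
        rows = rows[1:]
    return len(rows)

sample = "IBM Corporation,Armonk,NY\nI.B.M. Corp,Armonk,NY\n\nMicrosoft Inc.,Redmond,WA\n"
records = count_records(sample)
print(f"Estimated credits needed: {records}")
```

For a TSV file, pass `delimiter="\t"`. Compare the estimate with the balance shown by the Credits button.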
2. Launch the Wizard and Enter Your API Key
Open the Data Matching Wizard in your browser. Before beginning, enter your Interzoid API key in the top-right area of the header bar. Your key will be saved in your browser for future sessions.
- API Key Field: Type or paste your API key into the input field in the header. Click the lock/eye icon to toggle visibility.
- Check Credits: Click the Credits button to verify your current credit balance before starting a job.
- Language Selection: Click the language dropdown in the navigation bar to switch between any of the 17 supported languages. The entire wizard interface will update immediately. You can also set the language via URL parameter: ?lang=fr for French, ?lang=ja for Japanese, etc.
Once your API key is entered, click Get Started on the introduction screen to begin the wizard.
3. Select a Matching Function
The wizard presents six matching functions. Choose the one that matches your data and use case. Each function card shows a description and the column parameters it requires.
Single-Column Functions
These functions analyze one column of data to find matches:
| Function | Use Case | Column Required |
|---|---|---|
| Company Name Matching | Match variations like "IBM", "I.B.M. Corp", "International Business Machines" | Company Name |
| Individual Name Matching | Match "James Johnston", "Jim Johnston", "J. Johnston" as the same person | Full Name |
| Street Address Matching | Match "400 E Broadway St" with "400 East Broadway Street" | Address |
Combination Functions
These functions use two columns together for higher matching precision:
| Function | Use Case | Columns Required |
|---|---|---|
| Company + Address | Higher precision matching using both company name and street address | Company Name, Address |
| Company + Full Name | Contact deduplication using company and individual name | Company Name, Full Name |
| Address + Full Name | Person-at-address matching using address and individual name | Address, Full Name |
Click on the card for your chosen function, then click Next to proceed.
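The two tables above can be summarized as a lookup from function to required columns. This sketch uses the function and column identifiers listed in the API section of this guide. The pairing of each function with its column parameters is inferred from the tables, and `suggest_functions` is a hypothetical helper.

```python
# Function identifiers mapped to the column parameters they are assumed to need.
REQUIRED_COLUMNS = {
    "company-name-only": ("company_column",),
    "fullname-only": ("fullname_column",),
    "address-only": ("address_column",),
    "company-and-address": ("company_column", "address_column"),
    "company-and-fullname": ("company_column", "fullname_column"),
    "address-and-fullname": ("address_column", "fullname_column"),
}

def suggest_functions(available: set[str]) -> list[str]:
    """Return the functions whose required columns are all present in the file."""
    return [
        name
        for name, columns in REQUIRED_COLUMNS.items()
        if set(columns) <= available
    ]

# A file that has company names and addresses, but no individual names.
print(suggest_functions({"company_column", "address_column"}))
```

A file with more usable columns qualifies for more functions. The combination functions are the ones described above as offering higher precision.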
4. Upload Your Data File
Click the upload area to browse for a local CSV or TSV file on your computer. The wizard will:
- Auto-detect the file format based on the file extension (.csv, .tsv, or .txt).
- Upload the file securely to cloud storage so the matching engine can access it.
- Display a preview of the first several rows so you can verify the data and column structure before proceeding.
Once the upload completes and you see the preview, click Next to continue.
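To check a file locally before uploading, the format detection and preview can be approximated as follows. This is a sketch under assumptions, not the wizard's actual logic. In particular, the guide does not say how .txt files are interpreted, so the content-based guess here is an assumption, and both helper names are hypothetical.

```python
import csv
import io
from pathlib import Path

def detect_delimiter(filename: str, first_line: str = "") -> str:
    """Pick a delimiter from the file extension."""
    ext = Path(filename).suffix.lower()
    if ext == ".csv":
        return ","
    if ext == ".tsv":
        return "\t"
    if ext == ".txt":
        # Assumption: for .txt, guess from whichever separator is more frequent.
        return "\t" if first_line.count("\t") > first_line.count(",") else ","
    raise ValueError(f"Unsupported file extension: {ext!r}")

def preview(text: str, delimiter: str, max_rows: int = 5) -> list[list[str]]:
    """Return at most max_rows parsed rows from the start of the text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for _, row in zip(range(max_rows), reader)]

data = "IBM Corporation,1 New Orchard Rd,Armonk,NY\nI.B.M. Corp,1 New Orchard Road,Armonk,NY\n"
for row in preview(data, detect_delimiter("companies.csv")):
    print(row)
```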
5. Select Columns and Options
Specify which column numbers from your file correspond to the matching parameters. The column preview table from the previous step is shown here for reference.
- Column Numbers: Enter the 1-based column number for each required field. For example, if company names are in the third column of your file, enter 3.
- Combination Functions: For two-column functions, both column numbers must be provided and must be different columns.
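The two rules above (1-based numbering, and distinct columns for combination functions) can be expressed as a small check. This is an illustrative sketch. `validate_columns` and `to_index` are hypothetical helpers, and the error messages are not the wizard's own.

```python
def validate_columns(columns: dict[str, int], column_count: int) -> list[str]:
    """Return a list of problems with the chosen 1-based column numbers."""
    errors = []
    for name, number in columns.items():
        if not 1 <= number <= column_count:
            errors.append(f"{name}: column {number} is outside 1..{column_count}")
    # Combination functions need two different columns.
    if len(set(columns.values())) != len(columns):
        errors.append("each field must use a different column")
    return errors

def to_index(column_number: int) -> int:
    """Convert a 1-based column number to a 0-based list index."""
    return column_number - 1

row = ["1001", "Armonk", "IBM Corporation", "NY"]
print(validate_columns({"company_column": 3}, len(row)))  # no problems found
print(row[to_index(3)])  # the third column holds the company name
```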
Output Options
- Show Similarity Keys: When enabled (default), each output record includes the generated similarity key as the last column. Records with the same key are matches. Disable this if you want clean output with only the original data fields.
- Matches Only: When enabled (default), only records that have at least one other matching record are shown. Disable this to see every record in the file after processing, sorted by similarity key.
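The effect of the two options can be pictured with a short sketch that operates on rows that already carry a similarity key as their last field. It only illustrates the behavior described above and is not the wizard's implementation. The keys `K1` and `K2` are made-up placeholders.

```python
from collections import Counter

def apply_output_options(rows, show_keys=True, matches_only=True):
    """Filter and format rows whose last field is the similarity key."""
    counts = Counter(row[-1] for row in rows)
    if matches_only:
        # Keep only records that share their key with at least one other record.
        rows = [row for row in rows if counts[row[-1]] > 1]
    # Sort by key so matching records end up next to each other.
    rows = sorted(rows, key=lambda row: row[-1])
    if not show_keys:
        # Drop the key column to leave only the original data fields.
        rows = [row[:-1] for row in rows]
    return rows

rows = [["Acme Inc", "K2"], ["IBM", "K1"], ["I.B.M. Corp", "K1"]]
print(apply_output_options(rows))
print(apply_output_options(rows, show_keys=False, matches_only=False))
```

With both defaults enabled, the unmatched "Acme Inc" record is dropped. With both disabled, every record is returned in key order without the key column.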
Click Next when your column selections and options are configured.
6. Review and Run
The final screen shows a summary of all your selections: matching function, file name, format, column assignments, and output options. Review these carefully before proceeding.
Click the green Run Match button to start processing. The wizard will:
- Validate your API key and check that your account has sufficient credits for the job.
- Process each record through the selected matching algorithm using concurrent workers for performance.
- Generate the match report with records sorted and grouped into clusters of matching entries.
A progress indicator is shown while the job runs. Processing time depends on the number of records — most files complete within seconds, while very large files may take a minute or more.
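The general pattern of processing records with concurrent workers can be sketched with a thread pool. Everything here is an assumption made for illustration. `placeholder_key` is a trivial stand-in and not Interzoid's similarity algorithm, and the worker count and use of threads are not taken from the product.

```python
from concurrent.futures import ThreadPoolExecutor

def placeholder_key(value: str) -> str:
    """Hypothetical stand-in for the real similarity key generation."""
    return "".join(ch for ch in value.lower() if ch.isalnum())

def generate_keys(values, key_func=placeholder_key, max_workers=8):
    """Compute one key per value using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, so keys line up with values.
        return list(pool.map(key_func, values))

names = ["IBM Corp", "ibm corp.", "Microsoft"]
keys = generate_keys(names)
for name, key in zip(names, keys):
    print(f"{name!r} -> {key}")
```

Because results come back in input order, each key can be appended to its original record before sorting and grouping.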
7. Interpret the Results
The match report appears in the results panel at the bottom of the screen. Records are organized into clusters — groups of records that the AI has determined to be matches. Each cluster is separated by a blank line for readability.
Example Output
For a company name match on a CSV file with similarity keys enabled:
IBM Corporation,1 New Orchard Rd,Armonk,NY,d477E1d7sG6dja3hDNsk9P
I.B.M. Corp,1 New Orchard Road,Armonk,NY,d477E1d7sG6dja3hDNsk9P

Microsoft Inc.,1 Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
Microsoft Corporation,One Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
MSFT Corp,1 Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
In this example, the first cluster contains two records identified as variations of IBM, and the second cluster contains three records identified as variations of Microsoft. The last column in each row is the similarity key — all records sharing the same key are considered matches.
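A report like the one above can be grouped into clusters programmatically by reading the last column of each row. The sketch below assumes the report was produced with Show Similarity Keys enabled. `read_clusters` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

def read_clusters(report_text: str, delimiter: str = ",") -> dict:
    """Group report rows by the similarity key in their last column."""
    clusters = defaultdict(list)
    for row in csv.reader(io.StringIO(report_text), delimiter=delimiter):
        if not row:
            continue  # blank separator line between clusters
        *fields, key = row
        clusters[key].append(fields)
    return dict(clusters)

report = """IBM Corporation,1 New Orchard Rd,Armonk,NY,d477E1d7sG6dja3hDNsk9P
I.B.M. Corp,1 New Orchard Road,Armonk,NY,d477E1d7sG6dja3hDNsk9P

Microsoft Inc.,1 Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
Microsoft Corporation,One Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
MSFT Corp,1 Microsoft Way,Redmond,WA,k8Rp2mNx4wQjL9vB3cYh7T
"""

clusters = read_clusters(report)
for key, members in clusters.items():
    print(key, [fields[0] for fields in members])
```

Grouping by key rather than by blank lines keeps the code working whether or not the separator lines are present in the file.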
8. Save Your Results
Click the Save Results button above the results panel to download the match report as a file. On supported browsers, a save dialog will appear allowing you to choose the file name and location. On other browsers, the file will download automatically.
The saved file preserves the same CSV or TSV format as your original input, making it easy to import into spreadsheets, databases, or other data processing tools for further analysis.
9. API Access for Developers
The Data Matching Wizard is powered by a REST API that can also be called directly from your own applications, scripts, or data pipelines. This is useful for automating matching jobs or integrating matching into larger workflows.
Example API Call
$ curl "https://match.interzoid.com/match?connection=https://your-file-url/data.csv&filetype=csv&function=company-name-only&apikey=YOUR_API_KEY&company_column=3&showkeys=true&matchesonly=true"
API Parameters
| Parameter | Required | Description |
|---|---|---|
| connection | Yes | URL or path to the data file |
| filetype | Yes | csv or tsv |
| function | Yes | One of: company-name-only, fullname-only, address-only, company-and-address, company-and-fullname, address-and-fullname |
| apikey | Yes | Your Interzoid API license key |
| company_column | When applicable | 1-based column number for company name |
| fullname_column | When applicable | 1-based column number for individual name |
| address_column | When applicable | 1-based column number for street address |
| showkeys | No | true (default) or false; appends the similarity key to the output |
| matchesonly | No | true (default) or false; shows only matching clusters |
The API returns plain text output identical to what the wizard displays, making it suitable for piping into other tools or storing directly as a file.
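The same request can be assembled from a script. The following sketch builds the URL from the parameters in the table using only the Python standard library. The endpoint and parameter names come from the example call above. The helper names, the timeout, and the UTF-8 decoding of the response are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://match.interzoid.com/match"

def build_match_url(connection, filetype, function, apikey,
                    showkeys=True, matchesonly=True, **columns):
    """Build the request URL; column parameters are passed as keywords."""
    params = {
        "connection": connection,
        "filetype": filetype,
        "function": function,
        "apikey": apikey,
        **columns,  # e.g. company_column=3
        "showkeys": str(showkeys).lower(),
        "matchesonly": str(matchesonly).lower(),
    }
    # urlencode percent-encodes values such as the file URL.
    return f"{BASE_URL}?{urlencode(params)}"

def run_match(url, timeout=120):
    """Send the request and return the plain-text report (assumed UTF-8)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")

url = build_match_url(
    "https://your-file-url/data.csv", "csv", "company-name-only",
    "YOUR_API_KEY", company_column=3,
)
print(url)
# report = run_match(url)  # requires a real API key and a reachable file URL
```

The network call is left commented out so the example runs without credentials. The returned text can be written straight to a file or parsed like the wizard's saved report.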
The Data Matching Wizard makes it easy to discover hidden duplicates and matching records across your datasets. Whether you use it interactively through the browser-based wizard or programmatically through the API, it delivers clean, actionable match reports that help you improve the quality, consistency, and value of your data assets. If you have any questions or need assistance, don't hesitate to reach out to our support team.