Building a Contact Database


Building a comprehensive contact database is the foundation for a successful lead generation program. You need to follow a multi-step process of creating and maintaining thousands of contacts as data records. To get your messages delivered to the right target audience, the database should contain relevant companies, people, and their correct contact information. SmartLeads utilizes laser-targeted techniques to create a comprehensive contact database with relevant, micro-segmented target audiences, an important prerequisite for any successful outreach campaign.

The Process of Building a Contact Database

Building a contact database involves collecting information about the companies that can be your prospective customers, the people who are relevant decision-makers for such purchases, and their contact details such as email address, LinkedIn ID, and phone number. Company information also includes industry, revenue details, and the number of employees. As such, the database needs to hold all the information required for proper filtering and market segmentation of companies and people.

Multiple activities need to be done properly when building a contact database:

Market Segments are used to split your target audience into homogeneous groups of people, so that each group can receive customized messaging outlining a value proposition relevant to it.

Search is a necessary step for finding relevant companies and people, but on its own it is insufficient to maintain the required quality of data in the database.

Import provides a streamlined procedure for bringing data from multiple sources together in one centralized contact database.

Data Merge enables consolidation of data from multiple sources, making sure complementary data fields from different sources are brought together when merging duplicate data records.

Profiling analyzes available information and produces additional normalized data to enable automated mapping of companies and people with respective company profiles and person profiles.

Segmentation uses profiling data to automatically allocate each data record to a well-defined market segment.

Review provides an important feedback loop with a broader expert team to bring in all available industry knowledge and align with the business strategy.

Validation makes sure that the email contact information is accurate. This reduces the number of undelivered emails and maintains the domain reputation in good standing.

CRM sync enables proper coordination of sales activities, making sure proactive outreach to the whole market is not directed at existing customers, prospects or competitors.

Market Segments

The first question you should ask before building a database is what you are planning to do. Do you want to send one email message to all the people in the contact database? Probably not. You will want to break the people into groups based on similarities between them. People in similar companies and similar roles should receive the same messaging.

Assigning people to market segments is the method to segregate people into groups and have specific messaging directed at each group. Another reason to segment the market would be prioritization. You would want to have a method for selecting the most important market segments where you would reach out first, keeping less opportune market segments for a later time.

To define a market segment, we use an intersection of company profile and person profile.

A company profile is defined by the industry, headcount, geographic location, and other criteria. This is what is also commonly called the ideal customer profile (ICP). Many different parameters that we use give us the flexibility to support different requirements. For example, in some cases, you may want to know when the company was founded, how much funding they have raised, what product they sell, or what technologies they use.

A person profile, which is your buyer persona, is defined by the person's role in the organization, including seniority, function (department), and specialization. The relevant roles will be decision-makers, influencers, or evangelists for your product or service.
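
The intersection of company profile and person profile can be sketched as a small data structure. This is a minimal illustration, not the SmartLeads implementation: the class names, the field names (`industry`, `headcount`, `country`, `seniority`, `function`), and the example values are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompanyProfile:
    """Hypothetical ICP criteria: industry, headcount range, and country."""
    industries: frozenset
    min_headcount: int
    max_headcount: int
    countries: frozenset


@dataclass(frozen=True)
class PersonProfile:
    """Hypothetical buyer persona: seniority and function (department)."""
    seniorities: frozenset
    functions: frozenset


@dataclass(frozen=True)
class MarketSegment:
    name: str
    company: CompanyProfile
    person: PersonProfile

    def matches(self, record: dict) -> bool:
        # A record belongs to the segment only if BOTH profiles match.
        c, p = self.company, self.person
        company_ok = (
            record["industry"] in c.industries
            and c.min_headcount <= record["headcount"] <= c.max_headcount
            and record["country"] in c.countries
        )
        person_ok = (
            record["seniority"] in p.seniorities
            and record["function"] in p.functions
        )
        return company_ok and person_ok


segment = MarketSegment(
    name="US mid-market SaaS marketing leaders",
    company=CompanyProfile(frozenset({"SaaS"}), 50, 500, frozenset({"US"})),
    person=PersonProfile(frozenset({"VP", "Director"}), frozenset({"Marketing"})),
)

contact = {"industry": "SaaS", "headcount": 120, "country": "US",
           "seniority": "VP", "function": "Marketing"}
print(segment.matches(contact))  # True
```

Keeping the two profiles as separate objects means the same company profile can be paired with several person profiles, giving one segment per buyer persona.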


Search

Contact database creation starts with a comprehensive search in commercial databases and on social media. Multiple sources should be used so that you can fetch different pieces of information and combine them in your database. You will find good information about people’s backgrounds on LinkedIn, but you need to get their email and mobile phone details from a third-party database. For some specific requirements, you may need to do a Boolean Internet search and combine the results into the common database.


Import

Continuous data enrichment (bringing in additional data) throughout the contact database building process is imperative for achieving good data quality. Therefore, we bring company and people data from multiple sources and combine these data records in our central contact database. We use data field mapping to match the data fields obtained from each source to the corresponding data fields that already exist in our contact database.
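
Data field mapping can be illustrated with a small sketch. The source names, their column names, and the central field names below are hypothetical. The point is only that each source gets its own mapping onto one shared schema.

```python
# Hypothetical mappings from each source's column names to the
# central database's field names.
FIELD_MAPS = {
    "source_a": {"First Name": "first_name", "Last Name": "last_name",
                 "Company": "company_name", "E-mail": "email"},
    "source_b": {"fname": "first_name", "lname": "last_name",
                 "org": "company_name", "linkedin": "linkedin_id"},
}


def map_record(source: str, raw: dict) -> dict:
    """Rename known source fields to central field names; drop the rest."""
    mapping = FIELD_MAPS[source]
    mapped = {}
    for src_field, value in raw.items():
        target = mapping.get(src_field)
        if target is None:
            continue  # unmapped field: ignored in this sketch
        if isinstance(value, str):
            value = value.strip()
        if value not in ("", None):
            mapped[target] = value  # keep only non-empty values
    return mapped


row = {"fname": " Ada ", "lname": "Lovelace", "org": "Analytical Engines",
       "extra": "not mapped"}
print(map_record("source_b", row))
```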

Data Merge

Combining data from multiple sources is a complex undertaking. You don’t want to have duplicate records, so you need to match records from each new source with records already available in the contact database, ideally using a common identifier. The problem is that each database has its own identifier, and you end up matching people by first name, last name, company name, LinkedIn ID, and/or email ID. These matching criteria will be different depending on the source from where you are bringing the data.

Matching the data records that belong to the same person or the same company is only half of the task. You then need to decide how to merge the data for matching records, that is, how to combine the data fields available in the new source with the data fields already present in each record in the database. You may want to write new data into a data field only when this information is missing from the current data record, or you may decide to overwrite existing data with data from a more trusted source. Doing this kind of sophisticated matching and merging in Excel or Google Sheets is impractical. You need a database with a proper data merge engine to do it in a streamlined fashion.
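
The two halves of the task, matching and merging, can be sketched as follows. This is an illustrative outline under stated assumptions, not a description of an actual merge engine: the function names, the order in which identifiers are tried (LinkedIn ID, then email, then name plus company), and the field names are all hypothetical.

```python
def match_keys(record: dict) -> list:
    """Return every identifier this record can be matched on, strongest first."""
    keys = []
    if record.get("linkedin_id"):
        keys.append(("linkedin_id", record["linkedin_id"].lower()))
    if record.get("email"):
        keys.append(("email", record["email"].lower()))
    name = tuple(record.get(f, "").strip().lower()
                 for f in ("first_name", "last_name", "company_name"))
    if all(name):  # use the name key only when all three parts are present
        keys.append(("name", name))
    return keys


def merge_fields(existing: dict, incoming: dict, overwrite: bool = False) -> dict:
    """Fill empty fields from `incoming`; replace filled ones only if `overwrite`."""
    merged = dict(existing)
    for field, value in incoming.items():
        if value in ("", None):
            continue
        if overwrite or not merged.get(field):
            merged[field] = value
    return merged


def upsert(database: list, index: dict, incoming: dict, overwrite: bool = False) -> int:
    """Merge `incoming` into a matching record, or append it as a new record."""
    pos = next((index[k] for k in match_keys(incoming) if k in index), None)
    if pos is None:
        database.append(dict(incoming))
        pos = len(database) - 1
    else:
        database[pos] = merge_fields(database[pos], incoming, overwrite)
    for key in match_keys(database[pos]):
        index.setdefault(key, pos)  # make new identifiers matchable later
    return pos


db, idx = [], {}
upsert(db, idx, {"first_name": "Ada", "last_name": "Lovelace",
                 "company_name": "Analytical Engines", "email": "ada@example.com"})
# Second source knows the same email plus a LinkedIn ID and phone number.
upsert(db, idx, {"email": "ADA@example.com", "linkedin_id": "ada-l",
                 "phone": "555-0100"})
print(len(db), db[0]["phone"])  # 1 555-0100
```

Passing `overwrite=True` for a trusted source replaces existing values, while the default only fills gaps. Matching on name and company alone can produce false positives, so a real pipeline would likely treat that key with more caution.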


Profiling

When you search, you typically use specific keywords and certain selection criteria supported by each source (company size, person’s seniority, etc.). You don’t want to make your search criteria too granular because you end up with a small number of contacts in your search results and you will need to conduct multiple searches to build the whole database. Hence, people use general search criteria so as not to miss any relevant data. The flip side of this approach is that you end up with a diverse pool of people who are not separated into the specific market segments you defined earlier.

The other problem is that search is not always accurate. When you search on LinkedIn using a keyword, the keyword may appear in a previous job description from anywhere in the person's professional career. You end up with people in the search results who are not relevant to the target market segment.

To solve these problems, you need to have one more step in the process – detailed profiling of companies and people in the database.

Profiling uses criteria similar to those used for search, but now you want very granular means of selection to accurately allocate people to specifically defined roles relevant in that industry or functional space. We use several methods to determine the role of each person.

One is using their titles. But titles are written in many ways, so we use an NLP system that interprets the text of each title and normalizes it to a standard naming convention consisting of seniority, function, specialization, and other data types.

The second method is keyword and sentence analysis using NLP. The filter checks whether certain keywords are present in the text of the person's professional background: in their current job description, in a previous job description, or in the summary of their professional background. The ability to select which part of the person profile to search for keywords is essential for the success of this method.

For sentences in which we find certain keywords, we use NLP to validate that the found keyword is accurate and to validate the actual meaning in which the keyword is used.

To validate that the found keyword is accurate, we need to double-check that there are no more nouns directly following or preceding the found keyword. For example, if someone writes “social media strategy”, this sentence will be found using keyword search “social media”, while the actual keyword here is “social media strategy”. Our NLP discovers these cases and proposes to include or exclude such extended keywords for our filtering purposes.
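
A rough sketch of the extended-keyword check is shown below. It is a crude stand-in, not the NLP described above: instead of part-of-speech tagging to detect nouns, it treats any neighbouring word outside a tiny, hypothetical stop-word list as a candidate extension. It also ignores sentence boundaries. The function name `extended_keywords` is made up for this example.

```python
import re

# Tiny illustrative stop-word list; a real system would use part-of-speech
# tagging to decide whether a neighbouring word is actually a noun.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "for", "in", "on", "at",
              "to", "with", "by", "is", "are", "was", "were", "our", "my"}


def extended_keywords(text: str, keyword: str) -> set:
    """Return longer phrases formed by the keyword plus one adjacent word."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    kw = keyword.lower().split()
    n = len(kw)
    found = set()
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != kw:
            continue
        before = tokens[i - 1] if i > 0 else None
        after = tokens[i + n] if i + n < len(tokens) else None
        if before and before not in STOP_WORDS:
            found.add(" ".join([before] + kw))
        if after and after not in STOP_WORDS:
            found.add(" ".join(kw + [after]))
    return found


print(extended_keywords("Owned the social media strategy for EMEA.", "social media"))
# {'social media strategy'}
```

Each phrase returned is a candidate that a reviewer, or a later filtering rule, can choose to include or exclude.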

We also need to confirm that the meaning in which the keyword is used in a sentence is exactly what we are looking for. For example, someone who writes “I am using Salesforce” is likely a marketing or sales person. But someone who writes “I integrated our software with Salesforce” most likely has an IT or software development background and is not relevant to our search for marketing and sales people. Both people will appear in the results when simply searching for the keyword “Salesforce” on its own. Done manually, this type of review would require hundreds of person-hours. Only sophisticated NLP software like ours, based on our BrainCore cognitive technology, can make this determination automatically and within seconds.

After the profiling is done, we would have all the relevant normalized data about the company and person to filter out relevant people and allocate each relevant person to a specific market segment.


Segmentation

Segmentation is a separate procedure where the criteria defined in the company profile and person profile are used to allocate each data record in the contact database to one market segment with a specific company and person profile. Given that the definitions of market segments may overlap, the trick we use is to determine the sequence in which records are allocated to these market segments.

Imagine you have defined the whole US market as one market segment and then defined California as another. When you make California the priority market segment, all California-based companies will be allocated to it first. They will not be added to the US market segment since they have already been allocated to the higher-priority one. Using this approach, you can slice and dice the whole market into segments and micro-segments. You can carve out certain relevant niche areas, while you may not need to apply the same granularity to less critical or less densely populated regions.
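
The California and US example can be expressed as a first-match-wins allocation over an ordered list of segments. The sketch below is illustrative only. The function name `allocate`, the record fields, and the representation of a segment as a (name, rule) pair are assumptions.

```python
def allocate(records, segments):
    """Assign each record to the first segment whose rule matches it.

    `segments` is a list of (name, rule) pairs ordered from highest to
    lowest priority, so overlapping definitions resolve predictably.
    Records matching no segment are mapped to None.
    """
    allocation = {}
    for record in records:
        allocation[record["id"]] = next(
            (name for name, rule in segments if rule(record)), None)
    return allocation


# California is listed first, so it takes priority over the broader US segment.
segments = [
    ("California", lambda r: r["country"] == "US" and r["state"] == "CA"),
    ("US", lambda r: r["country"] == "US"),
]
companies = [
    {"id": 1, "country": "US", "state": "CA"},
    {"id": 2, "country": "US", "state": "NY"},
    {"id": 3, "country": "DE", "state": None},
]
print(allocate(companies, segments))
# {1: 'California', 2: 'US', 3: None}
```

Reordering the list changes the outcome: with "US" first, the California company would land in the US segment and the California segment would stay empty.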


Review

Review is another important step in the iterative process of building a contact database. To properly categorize a large pool of relevant people based on their titles or other custom attributes, a review mechanism is necessary. The team needs to bring its in-depth industry understanding to fine-tune the profiling procedure outlined above. A comprehensive review of the completed contact segmentation produces the feedback needed to better define the profiling criteria. This step helps us make the contact database an even better source of qualified leads.


Validation

A database is only as good as the accuracy of its data. The information in your database must be validated and updated periodically. People are likely to switch jobs, which makes it necessary to check current employment before reaching out to a given person. The higher the quality of the data, the more targeted the outreach.

Another type of information that requires accuracy is the email address. There are tools on the market that check whether an email address exists and is active. This validation lets you identify incorrect email addresses, direct effort into finding the correct ones, or discover that the person has moved to another organization.

Now before sending an email communication, we perform validation that each email ID is an active email account. Considering that people keep on changing their job roles, their emails need to be verified before executing an outreach sequence.

Skipping this step can have a serious impact on your domain reputation. When you send emails to invalid email addresses, the recipient’s email provider sends you an “undeliverable” notification, and spam filters track your bounce rate. Hence, email validation is crucial for keeping your bounce rate as low as possible. We use several tools for email validation to properly verify each email address before sending outreach communication.
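
Before addresses are handed to a verification tool, an inexpensive syntactic pre-check can remove obviously malformed entries and duplicates. The sketch below shows only that pre-check. It cannot tell whether a mailbox actually exists, which is what the validation tools mentioned above are for. The pattern and the function name `split_by_syntax` are assumptions made for illustration.

```python
import re

# Deliberately simple pattern: exactly one "@", no whitespace, and a dot
# somewhere in the domain part. It is not a full address grammar.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def split_by_syntax(emails):
    """Return (plausible, malformed) lists, de-duplicated case-insensitively."""
    seen, plausible, malformed = set(), [], []
    for email in emails:
        cleaned = email.strip().lower()
        if cleaned in seen:
            continue  # skip duplicates so each address is verified once
        seen.add(cleaned)
        (plausible if EMAIL_RE.match(cleaned) else malformed).append(cleaned)
    return plausible, malformed


ok, bad = split_by_syntax(["Ada@Example.com ", "ada@example.com",
                           "bob@@example.com", "no-at.example.com"])
print(ok)   # ['ada@example.com']
print(bad)  # ['bob@@example.com', 'no-at.example.com']
```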

CRM Sync

The CRM sync ensures that the data and status of each contact in our database are updated automatically. It enables automated identification of people who are prospects (having a dialog with the sales team) or are already a customer. It is also important to track competitors. All these companies (prospects, customers, competitors) and people in these companies are then excluded from outreach communication.
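
The exclusion step can be sketched as a filter over the outreach list. This is a simplified illustration: it assumes the CRM status has already been synced into a lookup keyed by company email domain, and the status labels, field names, and function name are hypothetical.

```python
# Hypothetical CRM status labels that should block proactive outreach.
EXCLUDED_STATUSES = {"customer", "prospect", "competitor"}


def outreach_list(contacts, crm_status_by_domain):
    """Keep only contacts whose company is not flagged in the CRM."""
    eligible = []
    for contact in contacts:
        # Use the part after the last "@" as the company domain.
        domain = contact["email"].rsplit("@", 1)[-1].lower()
        if crm_status_by_domain.get(domain) not in EXCLUDED_STATUSES:
            eligible.append(contact)
    return eligible


crm = {"acme.example": "customer", "rival.example": "competitor"}
contacts = [
    {"name": "Ada", "email": "ada@acme.example"},
    {"name": "Bob", "email": "bob@newco.example"},
    {"name": "Cy", "email": "cy@Rival.example"},
]
print([c["name"] for c in outreach_list(contacts, crm)])  # ['Bob']
```

Filtering at the company level, rather than per person, reflects the idea that everyone at a customer, prospect, or competitor company is excluded.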


An accurate, up-to-date, and clean contact database is necessary to drive your business’s growth. It allows you to nurture leads for your business and keep your brand top of mind. At SmartLeads, we continuously improve our processes and automation to offer our clients access to a highly qualified contact database. Employing the right tools and techniques, we amass data points about potential buyers and build a targeted database of contacts that are a good fit for your business. In a nutshell, with SmartLeads you unlock access to hyper-targeted market segments that have higher chances of conversion.