In this blog I would like to share some of my interesting findings in Database with Full-Text Search using C#.NET (You can substitute it with any technology that interacts with a database).
The most commonly used indexes in an SQL Server database are clustered and non-clustered indexes that are organized in a B-tree structure. You can create these types of indexes on most columns in a table or a view, except those columns configured with large object (LOB) data types, such as text and varchar(max). Although this limitation will not necessarily cause a problem, there will be times when you need to query such column types. However, without indexes defined on the columns, the query engine must perform a full table scan to locate the required data. Full-Text search comes in here.
Full-text search means that functionality in an SQL Server that supports full-text queries against character-based data. These types of queries can include words and phrases and even multiple forms of a word or phrase. To support these queries, Full-Text indexes must be implemented on the columns referenced in the query. The columns can be configured with character data types (such as char and varchar) and binary data types (such as varbinary and image).
Full-Text Searching (using the CONTAINS) is a faster and more efficient way than using LIKE with wildcard characters.
LIKE with wildcard characters (IE:
LIKE '%Search') cannot use an index (assuming one exists for the column), guaranteeing a table scan.
Let’s see how to implement full-text indexing in your SQL Server 2005 or 2008 database.
Steps to create Full-Text Search Index
Under selected DataBase -> Storage -> Full-Text Catalogs
Right click on Full-Text catalogs and click new full-text catalog
Name the catalog and press OK.
Now right click on the table column on which you want to apply the Full-Text Index.
Click on Add
Select Full-Text Catalog Name that you created.
Click on Columns, to assign the column on which you want to apply Full-Text Index.
Generally ‘Noise’ words (such as "a," "and," "is," and "the") are excluded from search. For example, in the English language, the mentioned words are left out from Full-Text index since they are useless in search. But you can include them if needed.
Functions and Examples
FREETEXT( ) is a base used to search columns containing character-based data types. It will not match the exact word, but the meaning of the words in the search condition. When FREETEXT is used, the Full-Text query engine internally performs the following actions on the free-text_string, assigns each term a weight, and then finds the matches.
- Separates the string into individual words based on word boundaries (word-breaking).
- Generates inflectional forms of the words (stemming).
- Identifies a list of expansions or replacements for the terms based on matches in the thesaurus.
CONTAINS( ) is similar to the Free-Text but with one significant difference. It takes one keyword to match with the records. If you want to combine other words in the search, then you need to provide the “and” or “or” in the search, else it will show an error.
Some FreeText Examples
SELECT * FROM content WHERE freetext(*,"home")
This queries all full-text-enabled columns in the content table for the string "home."
SELECT * FROM content WHERE freetext(description,"Magazine")
This only searches the description column and returns all matches for the string "Magazine."
SELECT * FROM content WHERE freetext(description,"SQL Mag")
Although this appears to search on the string "SQL Mag," it actually searches on "SQL" or "Mag."
Some Contains Examples
SELECT * FROM content WHERE contains(*,"home")
Like the Free-Text query, this searches all Full-Text-enabled columns for the keyword "home."
SELECT * FROM content WHERE contains(Description,' "Magaz*" ')
This statement queries the Description column for a word beginning with "Magaz." Note that the asterisk acts as a wildcard or placeholder, just as the percent sign (%) does with the LIKE keyword. (To make this work, you need to use single quotes on either side of the double quotes.)
SELECT * FROM content WHERE contains(Description,' "*azine" ')
This search yields no results. You can't use an asterisk as a placeholder for a prefix.
SELECT * FROM content WHERE contains(Description,' "Magazine" Or "Great" ')
This full-text scan uses OR so that you can search for "Magazine" or "Great"; it also works with AND and AND NOT. (Again, note the single quotes around the search criteria.)
SELECT * FROM content WHERE CONTAINS(description, 'NT NEAR great')
This search on the Description column finds all rows where "NT" is near "great".
SELECT * FROM content WHERE contains(description, ' formsof (inflectional, great) ')
This statement returns all results for "great," "greatest," "greater," and so on.
Full-Text search is a great feature that solves a database problem, the searching of textual data columns for specific words , phrases in SQL Server databases, using FREETEXT() and CONTAINS() with “and” or “or” operators which offers a much better performance.