Popular Posts

Friday, December 3, 2010

SQL Import Wizard -- Importing .csv and .tab files

This blog is to let you know how to use the SQL import wizard. It is very interesting. The most important thing to remember is that you can change the data length and type of both the file you are importing from and the length of the fields that you want to import to.

When you first enter into the Wizard, go past the Welcome screen and You come to the

Choose Data Source page.

You can choose many types of files to import, including an Access file, Excel file, or from a database or a .net Data Provider Source. If you know the details of you provider source application name, or know the location of the Excel file, or have access to the Access database, following those steps are all pretty easy.

However, if you want to import a comma spaced value (.csv) or a tab spaced value (.tab) file, you must use the "Flat File Source" option. I suggest using a .tab file if you are transferring data that uses any kind of non-numeric data such as a list of companies or clients information. These types of non-numeric data are likely to contain comma's themselves, and will make the import wizard think that some of the data items themselves belong in different rows.

A .tab file is very easily interepreted by the wizard. However, one caution is if you file is known to contain tabs. If it does, you will have the same problem whereever a tab appears in the data. Your columns will be skewed upon import.

If your data itself contains commas and contains tabs together. Your best bet would be to replace the tabs with a single space and use a .tab file. Otherwise, you could use a different delimiter such as a semicolon(;) or a vertical bar (|).

Next Step:

Select you .tab or .csv file source. In the open file dialog box, be sure to select files of all types where it says "Files of type:" at the bottom. This part of the process was not done well, because the file browser has as a default to show you only files of .txt or text files. However, that will not cause your .tab or .csv files to show up unless you saved them as .txt type.


Next, be sure to mark the type of delimiter being used (i.e. a tab for a .tab file and a comma for a .csv file). If you get this wrong you will have another chance to correct this before importing your file.

Important Options:

On the same page that you chose your file, you will see to the left a series of options, including:
General
Columns
Advanced
Preview

These are very important tools. If you learn them, it will save you hours of time. Upon exporting 19 million rows of records, I spent over 20 hours trying to solve import problems because I did not understand what I am about to explain.

General Tab:

This is where you upload the file. Under Locale the default is English (Ireland). That is fine unless the language is in another language or incorporates characters from another language. Consult someone else for that problem.

Code page is where you specify the basic encoding of the page ANSI - Latin I or US ASCII work the best for most files.

Under Format, you should choose "Delimited"for a .csv or .tab file.

Important: If there is a header row with column names in the file, you should set the value where it says "Header Rows to Skip:" to 1, that way you won't be importing the column names as your first row.

Columns Tab:

Specify the Row Delimiter as probably the {CR}{LF} value and the Column delimiter as the tab{t} for .tab files and comma{,} for .csv files. Here you can also select a semicolon or vertical bar{|} as we discussed earlier.

After you have chosen these values, you should click the Reset Columns button to get a preview of how your data will import. With a .csv file, choosing Row Delimiter: {CR}{LF} and Column Delimiter: tab{t} worked perfectly. Check to make sure your data is lined up correctly, and that the number of columns are correct.

Advanced Tab:

Here is the most important part*** under the advanced tab, you can choose the value for OutputColumnWidth. This is a very critical value. If you choose a value that is less than any of the data in that column, your import will fail. For example, if you choose 50, which is the default, and even one piece of data in that column is 51 characters long, then the data will need to be truncated for the import, and the the import will fail.

Solution: Set this value well above the typical data length for the column. 255 is a good number.

Here you can also change the number of columns, and give them the proper names if you didn't include a header row with the column names.

Preview Tab:

Here, you will see a preview of how your data will import just like in the columns tab.

However this has one important function, Data rows to skip. You can use this function to skip any number of rows you don't want to store. If you are having trouble importing a row at a certain position, just insert all the rows before that position by chopping off that portion of your text file (and keeping a backup of the entire file of course), then import all the rows after that position by skipping it.

Also you can use this function to see a preview for what data at a certain row will look like for importing.

Destination Page

Now click next and you must choose a destination SQL location. You should know the location, username and password.

*Select Source Table and Views Page

Click Next and you will have to select the name for the table you want to import to, or you can choose a table to directly insert data into.

I would not suggest ever inserting data into an existing table in a database. The reason is that you don't know how the import will go. Your best option is to import data into a new table, and then use a sql statement like

""INSERT INTO [EXISTING TABLE NAME] (COLUMN1, COLUMN2, COLUMN3, ..., COLUMN10) SELECT
(CORRESPONDINGCOLUMN1,
CORRESPONDINGCOLUMN2,
CORRESPONDINGCOLUMN3, ...,
CORRESPONDINGCOLUMN10)
FROM [NEWTABLE]"

to insert the data into the existing table. It is much faster to insert data with a sql call than with the export wizard. About 100 times faster, literally.

VERY IMPORTANT: Here is another place where detail is key. Click the Edit Mappings... button, and it will show you the data types and length of each column you are going to insert. If you choose to insert into a entirely new table, which is what I suggest, then you will not have trouble about truncating any data. Just make sure the size of each column is as great as the column widths you specified in the Advanced Tab earlier.

Good luck and happy importing!




No comments: