Monthly Archives: September 2014
The popularity of Access seems to go up and down on a regular basis. People tend to use Excel (albeit incorrectly) as a database tool as it is a relatively easy program to use, and most people are familiar with it so it is easy to create lists of information. Unfortunately, that is all it is…a list, not a fully functioning relational database. Excel uses formulas to create “links” which normally means VLOOKUPs galore. This is particularly inefficient as your spreadsheet will refresh all formulas each time you update the worksheet and as your data gets bigger this can take a long time to process.
Access on the other hand is designed for the very purpose of building databases. The downside is that it is not particularly intuitive – you can’t just open it and go unless you know what you are doing. In this series of blogs about Access I will cover;
- Database design & data normalisation
- Creating tables
- Linking tables & relationship types
- Creating queries
- Creating forms
- Creating reports
That’s not a definitive list but it covers the majority of what you will need to get creating and running a database.
First things first…designing your database.
Before you even open your PC or laptop, get a big sheet of paper and design your database there. This may seem like a backwards step but until you can visualise and create a database layout with all the necessary tables and links in your head…draw it on paper first!
But how do you go about designing a database?
First, think about how many tables you will need. Chances are if you are thinking in Excel terms you will create one massive table that contains everything. The biggest problem with this is duplication of data. You should be looking to create tables, where each one contains a unique set of related data, and each line in your table represents a unique record.
Let’s look at an example; I want to build a database to track my customers asking for training. At any customer site I deal with there will be one or more contacts that I deal with, perhaps in different departments, managers and staff etc. Problem is, that if I use a single table, each time I add a new contact I also have to add in all the details about the customer name, address etc. which results in a lot of data duplication. I could just enter the customer name and leave everything else blank, adding only the new contact information, but then how do I apply filters or run any look ups? It’s simply not going to work.
Wherever you see data duplication this is hint to create a separate table. This process is called DATA NORMALISATION. Continue doing this process until you remove all forms of duplication from your lists.
Now you’re thinking…I’ve created a bunch of separate tables, each containing unique data with nothing connecting them. At this point this is true, but this is where you need to create common fields between tables that are related to each other.
Continuing our example, we know that the contacts are related to the customers so we need to create a common field between the two tables. At this point think about which comes first – the customer or the contact? Can you have a contact without having a customer or vice versa? It is possible to know random people, but where did you meet them? At the pub, at work, at the gym? You both need something in common however brief to create the connection.
Bearing this in mind, the customer comes first in our example and contacts can share a customer name in common. We therefore need to add a new field in our contacts table that contains the customer ID. Each customer will have a unique ID reference or PRIMARY KEY field and this is what we will use.
This seems to go against having duplication in a table, but all you are duplicating (assuming there are multiple contacts at a customer site) is the ID number. No address information, or any other details relating to the customer are present in the contacts table. Unlike Excel there are no VLOOKUPs to refresh, there is simply a link between the two tables, which we create using the RELATIONSHIPS screen in our DATABASE TOOLS which I will describe in a later blog.
Once the relationship is created it will produce what is called a one to many relationship where the customer can only appear once in the customer table (the one side of the relationship), and each customer can have multiple contacts (the many side of the relationship).
So our database, in this case using only two tables, should look like this once in Access;
As you continue to add tables, think about how each one is related to other tables in the database. Don’t become over-zealous though with your linking, trying to link every table to every other table. As long as you can access all tables following a “route”, no matter how convoluted, around your database, you can extract pretty much any data from any part of it.
Looking at a possible database design above, you can see that the TOPICS table is not directly linked to INVOICES, but there is one via TRAINING JOBS so I could check the number and value of invoices per topic covered. There is no direct link between EXPENSE TYPE and the CUSTOMER tables but because there is a link via EXPENSES > TRAINING JOBS > CUSTOMERS I can extract a report that shows me what type of expenses I tend to have with each customer and you can see how this works in my blog about building queries.
Now imagine the number of VLOOKUPs or INDEX and MATCH or INDIRECT functions I would need to create to try and connect all this data if it was in Excel!
One more thing you need to consider when thinking about the number of tables you require…sticking with our training database, how can I connect the TOPICS table with the CUSTOMER table? Should I even bother? After all, that’s what the customer wants – me to teach a particular topic or range of topics.
Let’s think about it…it’s possible that any customer could want training in a variety of topics. Bearing in mind that each customer appears only once in the CUSTOMER table giving us a one to many relationship (one customer – many possible topics). How about the topics I teach? Each topic could be requested by several customers. Once again, if we look at the tables, our topics appear only once in the TOPICS table which would give us a one to many relationship with the CUSTOMER table (one topic requested by many customers). But we know this is not possible as both the topics and customer names appear only once in their respective tables. The type of relationship we are trying to create here is a many to many type and this is not possible when linking two tables, each containing a primary key field or list of unique values.
If you come across this type of relationship you will need an extra or intermediate table that sits between the two tables. In reality this is an orders table, or something we refer to as a TRANSACTIONAL table.
If you work with an ERP system, just think about the screens you go into. You will have supplier lists, item masters etc. but where you enter an order on a supplier, it is on a completely different screen or form and therefore a separate table in the database. So this is a very common set up in databases…however big or small.
In the table above we can see that each customer appears several times, and the topics also appear several times, and in one case, Excel appears twice for the same customer. What is unique though, is the order number and that will provide me with a PRIMARY KEY in this table.
So now you have the basic principles of database design, and in particular thinking about the number of tables you might need and how to split the data to create tables containing unique record sets, removing all possible duplication through the process of data normalisation. At first this process is not obvious, but with practice you soon see patterns emerging in your data and from there can quickly identify the tables you need to create in your database.
In my next blog I will show you how to create your tables in Access.
In Looping Part 1 ( http://wp.me/p2EAVc-aW ) I showed you how to write LOOPS using DO UNTIL/WHILE…LOOP. They are relatively simple to put together and easy to test and debug when things go wrong.
In this blog I am going to show you the FOR…NEXT loop.
This type of loop lets you control the number of loops more easily, as well as working out by itself when to stop rather than setting an end point in the DO type loops. They tend to be more efficient and therefore quicker. The downside to them for novices is that the cursor does not physically move from one cell to another, or from one sheet to another as all the movement is done in the background. This means you can’t run the DEBUG tool as normal and watch what it does to see where it goes wrong and you have to rely on the IMMEDIATE window to question the VB EDITOR to find out what is going on.
Creating a loop with a set number of loops;
Let’s say I want to generate random numbers for my weekly lottery ticket for which I only need 6 numbers. I need a loop that only goes round 6 times and shows me a number each time.
First we need a suitable variable to count the loops.
Dim x as Byte
Now for the loop;
For x = 1 to 6 ‘(stop after 6 loops)
Next x (automatically adds 1 to the variable at the end of each loop)
In terms of coding this is very simple. All I need to do now is add my code to generate random numbers and display them in a message box;
For x = 1 to 6
MsgBox “Lucky number ” & WorksheetFunction.Randbetween(1,49),vbInformation, “Lottery Numbers”
As numbers are generated randomly we would need to build into the code something to handle duplicate values but to keep it simple here I’m just generating 6 random numbers without any further checks.
Note that normally the message box code would all be on one line or perhaps on a couple of lines with an underscore (line continuation marker) at the end of line 1.
If you are unsure about how message boxes work refer to my blog Excel VBA Message Boxes ( http://wp.me/p2EAVc-3 )for more information.
So that is one simple application of the FOR…NEXT type loop. How about scrolling through a bunch of cells?
Looping through a range of cells;
Let’s say I have some data and I want to apply a bit of formatting. I could create a DO UNTIL type loop moving from row to row and then column to column, but a FOR…NEXT loop will be far more efficient.
I want to highlight all cells with a value under 100 by making the font red. I need code that will select the entire table, and then check each cell to see if each cell is under 100 and change values to a red font.
Dim c as Range ‘create an object variable to represent each cell in the selected range
Range(“A1”).CurrentRegion.Select ‘(selects all contiguous cells around A1)
For Each c in Selection
If c.Value <100 then ‘(tests if the value in each cell is less than 100)
c.Font.Color = vbRed
Next c ‘(loops through all cells in the selected range until it runs out)
So this is much easier than trying to work out when to go back to the top of a column and then move across one and so on and so forth.
Below is a video showing a For Each…Next loop in action showing how to use the IMMEDIATE WINDOW to find out what the code is doing and where.
Looping through multiple sheets;
I can apply the same logic to any number of sheets in a workbook.
Dim Ws as Worksheet
For Each Ws in ActiveWorkbook.WorkSheets
If WorksheetFunction.Counta(Ws.Cells) = 0 Then
‘(test if all cells are blank/empty)
So how does this work?
First we need an object variable (Ws) to represent any worksheet in the workbook. This is needed because we don’t necessarily know how many worksheets are in the current workbook or what they are called, so we need a way of referencing each sheet without being specific (i.e. hard coding names/numbers). And this is one of the issues you will have if anything goes wrong. The worksheet you are in is the active sheet, but you never move away from it. All the movement from sheet to sheet happens in the background, so in STEP INTO DEBUG mode you can’t see which sheet is being tested at any point while the macro is running. You have to rely on the IMMEDIATE WINDOW. To open the IMMEDIATE WINDOW select it from the VIEW menu or press CTRL + G.
The IMMEDIATE WINDOW lets you question the VB EDITOR. The only downside to this is you have to ask the questions in VBA! So, to find out which sheet is currently being checked the code above you would need to ask;
Remember you cannot ask about the ActiveSheet as it is always the same one – whichever sheet you started in.
Looping through multiple workbooks;
Having seen how to loop through multiple worksheets, this logic can be extended to do the same with multiple workbooks.
Dim Ws as Worksheet, Wb as Workbook
For Each Wb in ActiveWorkbooks
For Each Ws in Wb.Worksheets
If WorksheetFunction.Counta(Ws.Cells) = 0 Then
Now we have one object variable for the worksheets and one for the workbooks. The code will look for any open workbooks, and for each one it finds it will scroll through all the worksheets to see if they are blank or not and then move on to the next open workbook until it runs out of workbooks to check. By using object variables there is no need to hard code names etc. relating to the open workbooks or worksheets. Remember, if you need to check which workbook or worksheet is being checked by the code you will need to do this in the IMMEDIATE WINDOW.
One word of warning: if PERSONAL.XLSB is open it will treat it as an open workbook and test its sheets too. Problem is PERSONAL.XLSB only has one sheet and it is blank as this workbook is never used other than to write code to. If the code tries to delete its one and only worksheet it will crash as workbooks cannot exist without any worksheets. To get around this you need an IF statement before starting the loops to test the name of the current workbook.
So there you have the basics of loops in VBA:
- DO UNTIL/WHILE…LOOP
Both have their uses. DO loops tend to be easier to create and test, but FOR…NEXT loops are more efficient but tend to require object variables to get the most out of them and are a little trickier to test using the IMMEDIATE WINDOW.
It’s now down to you to put these into practice and apply them to whatever project you are working on.
One thing you cannot record as part of an Excel macro is looping i.e. moving from one row/column, worksheet or even workbook to another. You can record copying a formula down a column or similar, but it records it as AUTOFILL and therefore records the cells as absolute references meaning that if you have any variation to the number of rows etc. when you recorded the macro, it won’t work properly. So this is where you need to do LOOPS in VBA.
There are two basic loops you can use;
- Do While/Until…..Loop
- For each…..next
In this first blog I am going to go through the DO WHILE/UNTIL type loop.
Let’s take a simple example, where I want to add a formula in column C adding up values in columns A and B.
The code required to do this would be as follows;
First we need a start point i.e. where do we want to start the loop from? In this example, the first cell is C2.
Cells(2,3).Select (use whichever method you prefer)
Then we need to give an instruction when to stop looping. Here, we want the code to stop running when the cell(s) to the left of the formula cell is/are empty. You would need to decide how to handle this depending on the data you have – are all columns full of data? Are there any blanks in either or both columns? This will determine what you need to put in your DO UNTIL/WHILE command.
Do Until ActiveCell.Offset(0,-1).Value = “” (stop when cell to left is blank)
Do Until Activecell.Offset(0,-2).Value = “” And ActiveCell.Offset(0,-1).Value = “” (stop when both columns A and B are blank)
If you are not familiar with the OFFSET function in VBA, it allows you to reference cells based on their distance in terms of ROWS and then COLUMNS from the selected or active cell.
Positive numbers refer to ROWS down and COLUMNS to the right, whereas negative numbers refer to ROWS up, and COLUMNS to the left of the active cell. If you had cell C5 selected, for example, then;
ActiveCell.Offset(2,-1) would refer to cell B7 – two rows down & 1 column to the left.
The alternative to DO UNTIL is DO WHILE. Works the same way, just a question of logic to tell the code when to stop. If using DO WHILE in our original example the code would be like this;
Do While ActiveCell.Offset(0,-1)<> “” (keep going as long as the cell in the column to the left is not empty)
Personally, I tend to go with DO UNTIL…seems more logical to me, but the choice is yours.
Get into the habit of adding in the word LOOP to make sure you close off the pairing of code words. This will give us the following code so far;
Do Until ActiveCell.Offset(0,-1).Value = “”
One thing you will always need in a LOOP, is movement. You need to add a line that tells the editor which direction to travel in as part of the loop. If you don’t do this, then you stay permanently in the start cell and therefore in a permanent loop. If that happens you can use CTRL + BREAK to stop the loop and show the END/DEBUG window. If it is the LOOP causing the problem the code word LOOP should be highlighted in yellow.
In this example, I need to move down a row each time so I will add the line;
This should be just above the LOOP line. If you go on to do more complex loops, think carefully where the movement comes in. I’ll show you a simple example later.
The final part of the code is to tell the code what to do with each loop. In this example it is to add a formula. Here, I will be using R1C1 cell referencing style. If you don’t know what that is, or how it works check out my blog about R1C1 reference style ( http://wp.me/p2EAVc-4I ). Good to know when it comes to VBA formulas!
My final code will therefore be;
Do Until ActiveCell.Offset(0,-1).Value = “”
ActiveCell.FormulaR1C1 = “=SUM(rc[-2]:rc[-1])”
When you have a loop set up, it’s always worth a look step by step using the DEBUG tools to see the loop in action and to check that it’s doing everything it’s meant to be doing, especially if any variables are concerned.
I mentioned earlier about being careful where you put your movement line in your code. Let’s take a look at a bit of code that removes blank lines from a worksheet;
Do Until ActiveCell. Value = “End”
If ActiveCell.value = “” then
Here, the movement is built into the IF statement, rather than at the end just before the LOOP. This is necessary because when you delete a row, all rows underneath automatically move up to “fill the void”. If you move each time you delete a line, and there are two or more blank rows in succession, the next blank row would move up to your current location, and then the cursor would move down a row, completely missing out testing the row that had just moved up. The video below will show clearly what happens if the movement is just before the loop, and then with the movement in the IF statement.
So there you have some basic code to create a loop. Looping Part 2 will look at the FOR EACH…NEXT type loop which is a quicker and more efficient way of looping, letting you loop through multiple worksheets and workbooks in seconds, but do get used to the DO UNTIL/WHILE loop. It has its uses and is easier to debug and test than the FOR…NEXT type loops as you will see in Part 2 of looping.