• SQL Database Normalization (Third Normal Form|3NF)

    CatheriBurbank Member

    Hello I need some help with normalization I’m having a hard time understanding how to normalize.

    This is an exercise where I am to normalize these tables into third normal form given this info:

    These tables are about bidding on a website similar to any bidding website you can think of. They will keep track of the information below.

    These are assumptions that are made

    A user may sell more than 1 book
    A user may bid on more than 1 book
    A user can only bid once on a book.
    Title
    Edition
    ISBN Number
    BookID (unique)
    Condition
    Initial Offering Price
    Current Bid
    Current Maximum Bid
    Auction Start Date & Time
    Auction End date & Time
    UserID of the Seller
    UserID of the current high bidder
    Indication that the auction is either active or complete

    For each registered user:

    Name,
    UserID(unique)
    Password
    E-mail

    If I could get some help on this, it would be great.

  • BuckRenard Member

    3NF is a concept where you don’t store redundant data in a table – you link rather than storing multiple times.

    So you would have a table for: registered user, books, and bidding.

    The registered user table would have name, userid, password, email
    Books would have: title, edition, isbn, bookid, condition
    Bidding would have: bookid, selling user (validated against registered user table), initial offering bid, auction start date (with time included), action end date (with time included), current bid, bidding user (also validated against registered user table), and have a unique key comprised of bookid & selling user & bidding user.

    Your conditions: a user can sell more than one book: each book added has a unique number therefore the unique key created in bidding allows for a selling user to post more than one book – even he has several copies of the same book to offer

    A user may bid on more than 1 book: same as above – the unique key created is based upon the bookid

    A user can only bid once on a book: your sql command will check for a duplicate key (bookseller bidder) – if it is a duplicate you bounce them out.

    None of the data is stored more than once and can be linked together by simple sql to create a form with validation.

  • Abhey Member

    How did you do that so quick, I need some help to understand the concept of normalization. Do you have any suggestions? I’ve watched videos on YouTube and seemed to get it but then when I try a new problem I have a hard time creating separate tables.

    Also when you’re talking about unique key do you mean primary key?
    This is what I figured would be primary keys:

    Registered user table: UserID(PK)
    Books table: BookID(PK)
    Bidding Table: BookID(PK) — not sure about the other columns, how would you grab data from Registered user table without a relationship in any of the columns?

  • BuckRenard Member

    The primary key for the bidding table could be the combined fields of bookid&seller&bidder or could be just a unique field.

    The bidding table has the seller id and buyer id; so you would just do a ‘join’ to the registered user table using this. You can have as many indexes as you want to help speed the queries.

    A unique key is often made the primary key so that you can avoid another index on your table

    As for suggestions for visualizing, this one was really easy so I just did it on the fly; however, for complicated ones or when my head just isn’t into it, I grab a piece of paper (yep, real paper) and draw several boxes on it.

    From there, I walk-through the process of how the real flow happens and how I would need to process something by hand: so in this example –

    – registration of a user: I need a book to keep track of them

    – book library: a seller would have a book in his hand and walk up to the front desk wanting to sell it; so I want to log the book somehow; but I know the person doing the logging won’t be impressed if they have to write out the same book 50 times if the seller happens to have 50 copies – so I know that a need a repository for this

    – user activity: if a buyer were to come in and put up a paddle in an auction, the auctioneer would need a very fast way to know if the potential buyer has already bid. In a real auction, they would have given you a paddle with a number. So you have the paddle number log, which has the paddle number, and just the buyer number. To record the activity, if you wanted, you would take this log book, and tie it together with the registered bidders, sellers and books. In our electronic world, we can combine the activity with the ‘paddles’ – the only reason the paddle log was kept so small in the paper auction was to allow the bids to be checked against the set of rules.

    So after a quick walk through, I find that it is likely that I will have at least 3 log books – translated, 3 tables. Now there could be more, or detailed changes required depending upon the specific requirements.

    You have basically flow charted the information flow – so you just can’t be that off the mark!

    The professor could take your question and then say “a seller can also sell several books in a group” – make the changes as required. Well, nothing about the information flow has changed now has it? Exactly the same other than I now have to group several books into a bidding lot. Hmmm… do I need a new table? Probably: one containing lot number and book ID. This way I can accomodate a lot having millions of items or just 2. A few tweaks to the bidding table to include the lot number etc …

    The basic structure of the tables however, is not changed.

    You probably have already realized that 3NF is hardly ever adhered to in the real world.

    With the speed of computers and availability of storage, it often is better to have some duplicate info stored to facilitate programming, reducing the number of tables needed to be called (especially when those tables are huge).

  • Adelaid Member

    “The bidding table has the seller id and buyer id; so you would just do a ‘join’ to the registered user table using this. You can have as many indexes as you want to help speed the queries. ”
    Okay so can you do something like this then for example:

    SELECT Bidding.SellingUser AS SellerID
    FROM Bidding
    INNER JOIN Users
    ON Bidding.SellerID = Users.UserID

    I’m just curious because I thought the values inside the ON statement had to be the same or am I wrong about that?

    Thank you for the explanation and the example you gave, it’s starting to make much more sense. Yea, I also had always wondered why doing this doesn’t slow down the process of finding something since you have to look at multiple tables but I guess the computer is quick enough that doing this actually saves time.

  • ShikhaTan Member

    3NT is not really used 100% in the real world – the cost of storage and processing is not like it was back in 1990 when every byte of data mattered!

    The scale of the tables will also dictate the structure at times. If there are 5 billion records sitting in a table that and there are 400 hits per minute, you will definitely see different structures.

    However, back to your specific question:

    In your sql statement, the values inside the ON will be the same!

    => Users = every user, both buyer and seller,
    => so bidding user is just a subset of the users table

Viewing 5 reply threads
  • You must be logged in to reply to this topic.