Course Work

Statistics on wine tasting reviews

As a part of one of my latest courses ‘Applied Statistics – from data to results’, we were asked to find any kind of large data set and apply at least one hypothesis test on it.

My group and I found a data set consisting of 130,000 wine tasting reviews on and set to work. We worked hard on the data analysis and visualisation as a team and ended up learning a lot about applying hypothesis testing to real-life data and about programming it, but probably even more about the importance of careful data visualisation.

We ended up doing a presentation of our findings in front of all course participants (selected projects were asked to present).

In this post, I will summarise our presentation together with an outline of our findings and methods, and later on I will link this post to specific guides for each procedure. In case you’re wondering, the wine bottle word cloud was also generated from our data set.

Introducing the data

The data was retrieved from, and was scraped in November 2017.

The data set consisted of 129,971 rows of data with 14 parameters of various types. The values were numbers, NaN values, or strings describing e.g. price, rating, or wine descriptions.

Next: reduction and sorting of data

We first had to roughly discard non-descriptive data. Our main hypotheses revolved around relations with either the price or the rating of a wine, so we dropped all rows with NaN values in these columns.
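
As a minimal sketch of this filtering step, the snippet below drops every row that lacks a usable price or rating. The rows and the column names `price` and `points` are made up for illustration and are assumptions, not taken from our actual code.

```python
import math

# Hypothetical rows; the keys "price" and "points" are assumed column names.
reviews = [
    {"title": "Wine A", "price": 15.0, "points": 87},
    {"title": "Wine B", "price": float("nan"), "points": 90},
    {"title": "Wine C", "price": 40.0, "points": None},
    {"title": "Wine D", "price": 22.5, "points": 91},
]


def is_missing(value):
    """Return True for None and for floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_missing(rows, required=("price", "points")):
    """Keep only the rows that have a usable value in every required column."""
    return [
        row for row in rows
        if not any(is_missing(row.get(col)) for col in required)
    ]


cleaned = drop_missing(reviews)
print(len(reviews), "rows before,", len(cleaned), "rows after")
```

Only the two columns the hypotheses depend on are checked, so rows with gaps elsewhere are kept.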

We then extracted the vintage of the wine from the wine description, as that was not a parameter.
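
One way such an extraction could look is a regular expression that picks the first plausible year out of a free-text field. This is only a sketch: the accepted year range (1900 to 2029) and the example strings are assumptions, and a number that merely looks like a year would be picked up as well.

```python
import re

# A four-digit year from 1900 to 2029 that is not part of a longer number.
# The accepted range is an assumption made for this sketch.
YEAR_PATTERN = re.compile(r"(?<!\d)(19\d{2}|20[0-2]\d)(?!\d)")


def extract_vintage(text):
    """Return the first plausible year found in the text as an int, or None."""
    match = YEAR_PATTERN.search(text)
    return int(match.group(1)) if match else None


# Hypothetical example strings.
for text in ("Estate Riesling 2013 (Mosel)", "Non-vintage Brut", "Lot 12345 Reserve"):
    print(repr(text), "->", extract_vintage(text))
```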

After this, we had 120,975 rows of data and 15 parameters.

Choosing hypotheses

Any filtering of the data set (i.e. choosing reviews for one city, choosing reviews for one reviewer, or a similar fixing of parameters) meant a serious reduction in the available data. The hypotheses were therefore revised a couple of times. We ended up investigating:

  1. Whether ratings are normally distributed.
  2. Potential correlations between prices and ratings.
  3. Bias in tasters.
  4. Application of machine learning algorithms to predict either wine ratings or prices.

H1: Normally distributed ratings

To test the normality, we made a histogram, and constructed a reference normal distribution based on the data’s mean and variance.

To actually test the normality of the distribution, we then performed a Kolmogorov-Smirnov test between these two distributions, which returned a test-value of 0.9999.. (most plausibly the p-value of the test rather than the KS statistic itself). A p-value this close to 1 means the test finds no evidence against normality, so we concluded that the ratings were consistent with a normal distribution.
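
The test above can be sketched with the Python standard library alone. This is an illustration on synthetic data, not our original code: the function names are hypothetical, the p-value uses the usual asymptotic Kolmogorov series with a small-sample correction factor, and the two samples are generated rather than taken from the wine data.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_statistic_vs_normal(sample):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    data = sorted(sample)
    n = len(data)
    reference = NormalDist(mean(data), stdev(data))
    d = 0.0
    for i, x in enumerate(data):
        cdf = reference.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def kolmogorov_p_value(d, n):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here and the p-value is ~1
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return min(1.0, max(0.0, 2.0 * series))


# Synthetic stand-ins for the ratings: one bell-shaped, one clearly skewed.
rng = random.Random(0)
bell_shaped = [rng.gauss(88.0, 3.0) for _ in range(2000)]
skewed = [rng.expovariate(1.0) for _ in range(2000)]

for name, sample in (("bell-shaped", bell_shaped), ("skewed", skewed)):
    d = ks_statistic_vs_normal(sample)
    p = kolmogorov_p_value(d, len(sample))
    print(f"{name}: D = {d:.4f}, p = {p:.3g}")
```

Note that the KS statistic D is small when the distributions agree, while the p-value is large. Because the mean and standard deviation of the reference are estimated from the same sample, the p-value of this plain KS test tends to be optimistic.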

H2: Price and rating correlations

To test the correlation between prices and ratings in an attempt to see if expensive wines are rated lower or higher based on their price, we plotted the mean price in each point-group and the variance.
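
The grouping behind such a plot can be sketched as follows. The (points, price) pairs are invented for illustration and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical (points, price) pairs.
reviews = [
    (85, 12.0), (85, 18.0), (85, 15.0),
    (90, 30.0), (90, 50.0),
    (95, 120.0), (95, 200.0), (95, 160.0),
]


def price_stats_by_points(pairs):
    """Map each rating to the (mean, sample variance) of its prices."""
    groups = defaultdict(list)
    for points, price in pairs:
        groups[points].append(price)
    stats = {}
    for points, prices in sorted(groups.items()):
        # The sample variance needs at least two prices.
        spread = variance(prices) if len(prices) > 1 else 0.0
        stats[points] = (mean(prices), spread)
    return stats


for points, (avg, spread) in price_stats_by_points(reviews).items():
    print(f"{points} points: mean price {avg:.2f}, variance {spread:.2f}")
```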

While there might be a higher-order or exponential correlation in the higher point ranges, the lower point ranges appear to follow a roughly linear correlation.

In the end, we settled on the conclusion that there is some positive correlation between the rating of a wine and its price, but it is non-trivial and can only be fitted with high-degree polynomials, which are likely to just overfit the data. A closer look at the data, however, showed that this was not the case, so we ended the analysis there.

H3: Taster’s bias

Joe Czerwinski

We drew the reviews from a single reviewer, Joe Czerwinski, and analysed his ratings of German and French wines. To eliminate fluctuations in the ratings due to price, we transformed the measure to one of points per dollar. Then we compared the histograms of these ratings using a Kolmogorov-Smirnov test again.

The Kolmogorov-Smirnov test returned a test-value of \( 3.8 \times 10^{-9} \), which, read as a p-value, means the two distributions are very unlikely to be the same. For comparison, we tested the distributions of all reviews of German and French wines, where the KS test indicated greater similarity with a test-value of 0.47.

We therefore concluded that Joe was inclined to prefer French wines with a larger points-per-dollar value.
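
The comparison described in this section can be sketched as a two-sample KS test on points per dollar. The review data below is invented, the function names are hypothetical, and the p-value is the asymptotic approximation, which is rough for samples this small.

```python
import math
from bisect import bisect_right


def points_per_dollar(points, price):
    """Rating divided by price, the measure used to compare wines."""
    return points / price


def ks_two_sample(a, b):
    """Return (D, approximate p-value) for the two-sample KS test."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    # D is the largest gap between the two empirical CDFs.
    d = max(
        abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        for x in a + b
    )
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return d, 1.0  # the series converges poorly here and the p-value is ~1
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, 101)
    )
    return d, min(1.0, max(0.0, 2.0 * series))


# Hypothetical (points, price) reviews by one taster.
german = [(88, 20.0), (90, 25.0), (91, 30.0), (87, 18.0), (92, 40.0)]
french = [(89, 45.0), (91, 60.0), (93, 80.0), (90, 55.0), (94, 120.0)]

german_ratio = [points_per_dollar(pts, price) for pts, price in german]
french_ratio = [points_per_dollar(pts, price) for pts, price in french]

statistic, p_value = ks_two_sample(german_ratio, french_ratio)
print(f"D = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value only says that the two distributions differ. Which country comes out ahead has to be read from the distributions themselves, for example from their means or histograms.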

19 reviewers

Using the points-per-price index, we made an overview of 19 different tasters and their mean ratings (with the standard deviation showing their ‘experience’, at least in the data set).
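
An overview like this boils down to a group-by on the taster. The sketch below uses invented reviews and placeholder taster names, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (taster, points, price) reviews.
reviews = [
    ("Taster A", 90, 30.0),
    ("Taster A", 88, 22.0),
    ("Taster A", 92, 46.0),
    ("Taster B", 85, 17.0),
]


def taster_summary(rows):
    """Map each taster to (review count, mean, standard deviation) of points per price."""
    ratios = defaultdict(list)
    for taster, points, price in rows:
        ratios[taster].append(points / price)
    return {
        taster: (
            len(values),
            mean(values),
            # The sample standard deviation needs at least two reviews.
            stdev(values) if len(values) > 1 else 0.0,
        )
        for taster, values in ratios.items()
    }


for taster, (count, avg, spread) in taster_summary(reviews).items():
    print(f"{taster}: n = {count}, mean = {avg:.2f}, std = {spread:.2f}")
```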

H4: Machine learning predictions

We applied two machine learning algorithms in order to use as much data as possible to predict either the wine rating or the wine price based on the other information.

Both methods proved to be almost equally good at predicting ‘correctly’, meaning close to the actual values we had withheld and asked the algorithms to estimate. Although some predicted distributions might seem closer to the original one, measures of variation showed that they deviate equally.

This goes to show that the complexity of a data set can weigh heavily on a researcher. Fixing one parameter value can seriously reduce the amount of data you look at, so with 15 parameters, ~121k rows of data might not be that much after all.
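
The idea of withholding data and scoring the predictions against it can be sketched as below. Since the two algorithms are not named in this post, a simple least-squares line (rating against the logarithm of the price) stands in for them here. The data is synthetic and every name and number in the snippet is an assumption made for illustration.

```python
import math
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic data: ratings rise with the logarithm of the price, plus noise.
rng = random.Random(42)
prices = [math.exp(rng.uniform(math.log(5), math.log(200))) for _ in range(500)]
points = [80 + 3 * math.log(price) + rng.gauss(0, 1) for price in prices]

# Withhold the last 100 reviews and fit on the first 400.
split = 400
train_x = [math.log(price) for price in prices[:split]]
train_y = points[:split]
test_x = [math.log(price) for price in prices[split:]]
test_y = points[split:]

slope, intercept = fit_line(train_x, train_y)
model_error = mean_absolute_error(test_y, [slope * x + intercept for x in test_x])

# Baseline: always predict the mean rating of the training data.
baseline_error = mean_absolute_error(test_y, [mean(train_y)] * len(test_y))

print(f"model MAE = {model_error:.2f}, baseline MAE = {baseline_error:.2f}")
```

Comparing against a trivial baseline on the withheld rows shows whether a model has learned anything beyond the average rating.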

Points prediction

Price prediction

