Boston Magazine does Metacritic for Boston restaurant reviews
Interesting feature in the new Boston Magazine, which took the novel approach of averaging reviews of Boston's "top" restaurants from multiple sources, including the lead restaurant critics of the major local print dailies, weeklies, and monthlies, a local TV show, Yelp, and Chowhound. (The Chowhound contribution was done by soliciting the input of a few board regulars, myself included): http://www.bostonmagazine.com/restaurants/articles/the_50_best_restaurants/page1
Particularly entertaining is the breakdown of each of the critics (though the sample Chowhound quote used sounds more like something a Yelper would post): http://www.bostonmagazine.com/restaur...
-
Maybe I just missed it, but what would be nice is the full listing of the ratings from every critic for every place. It'd be interesting to see what (if any) correlations existed between critics.
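If the full critic-by-restaurant table were ever published, checking for correlations between critics would be a short exercise. The sketch below is only an illustration: the restaurant names, critic names, and scores are made up, and it assumes every rating has already been converted to a percentage.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a critic gives constant scores")
    return cov / sqrt(var_x * var_y)

def critic_correlation(ratings, critic_a, critic_b):
    """Correlate two critics using only restaurants both of them rated."""
    shared = [r for r in ratings if critic_a in ratings[r] and critic_b in ratings[r]]
    return pearson([ratings[r][critic_a] for r in shared],
                   [ratings[r][critic_b] for r in shared])

# Made-up percentages, for illustration only (not the magazine's data).
ratings = {
    "Restaurant A": {"critic_1": 100, "critic_2": 90},
    "Restaurant B": {"critic_1": 75, "critic_2": 70},
    "Restaurant C": {"critic_1": 50, "critic_2": 60},
    "Restaurant D": {"critic_1": 88},  # skipped: critic_2 never reviewed it
}
r = critic_correlation(ratings, "critic_1", "critic_2")
```

A value of `r` near 1 would mean the two critics tend to rank places the same way; a value near 0 would mean their opinions are largely unrelated.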
-
re: MC Slim JB
got it
thanks
I'd like to see how those numbers were weighted, especially by the age of the last review (Globe/Herald/Phoenix), and what happened if a place wasn't reviewed by some of the media. I'd like to see all the mathematical criteria published. There are some restaurants on there that I'm not so sure about, and I'm not even linking it to advertising. Bin 26 Enoteca?? Avila???
-
re: Wursthof
Hi all-
I have a more detailed reply for you from Elaine Allen, Associate Professor of Statistics & Entrepreneurship at Babson College, who calculated the scores:
"The method I use for doing rankings is designed to highlight the differences rather than the similarities in ratings, although the weights applied later do overcome this a bit, since Boston Magazine was weighted higher than the newspapers. That said, a review from over a year ago was down-weighted by 10% but more importantly, a Globe rating of 3/4 = 75% while 4/4 = 100%, and adding the down-weighting by 10% moves that even further down to a 69%, compared to the 100% for Rialto, a big difference. The Phoenix rating is from over two years ago, so the 100% (4/4) in 2005 becomes an 81% in 2008. Also, the Zagat ratings are not identical; Rendezvous is slightly lower.
Hope this helps. Makes sense to me, but you might also mention that the scores of the two restaurants are pretty similar: while the ranking puts them 13 steps apart, they are in the same range (scores from 60 to 70)."
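The age adjustment described in the quote can be sketched as follows. This is a guess at the mechanics, not the statistician's actual formula: it assumes the 10% discount compounds once per full year of age, which is inferred only from the Phoenix example (100% becoming 81% after two years). The function name and parameters are hypothetical.

```python
def age_adjusted(score_pct, full_years_old, yearly_discount=0.10):
    """Discount a percentage score by a fixed rate per full year of age.

    Assumption: the discount compounds annually, so two years at 10%
    leaves 0.9 * 0.9 = 81% of the original score.
    """
    return score_pct * (1 - yearly_discount) ** full_years_old

phoenix_rendezvous = age_adjusted(100, 2)  # 4/4 review, two years old
globe_rendezvous = age_adjusted(75, 1)     # 3/4 review, one year old
```

Under this assumption the two-year-old 4/4 comes out at about 81, matching the quote, while the one-year-old 3/4 comes out at about 67.5 rather than the quoted 69, so the published figure presumably reflects some additional step that is not described here.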
-
re: amytraverso
How come the Rialto review wasn't downrated? It's over a year old.
They have identical Zagat scores for food. Apparently, if a restaurant wasn't reviewed by one of the sources, that is a benefit to them (e.g., the Phoenix re: Rialto), and if they were reviewed and the review was over a year or two old, they get severely penalized (e.g., Rendezvous re: the Phoenix). Those that haven't been reviewed in a really long time were penalized to an extreme.
Doesn't sound fair or accurate to me. Still doesn't add up.
No worries about responding any further, I'm done. It is what it is and I still hold my contention that some restaurants were unjustly and inaccurately portrayed by your list.
Jolyon has my email if you wish to correspond further.
-
re: Wursthof
I do appreciate your concerns, and I'd be interested in hearing your specific thoughts on how we might have approached the analysis differently (I'd prefer to chat publicly, since that's where we started, and others might have similar thoughts).
First of all, to answer your question, the Rialto review wasn't downrated because it was less than two years old.
When we set out to analyze the data, fairness was our primary concern. If we had given a ten-year-old review the same weight as a one-year-old review, that certainly wouldn't have been fair, given how often chefs move around and menus change. So we adjusted for age as best we could. It's also why we gave our own rankings more weight -- they were the most recent and consistent.
I'm not sure what you mean by "penalizing" a restaurant for not being reviewed by a particular outlet (or for having an older review, for that matter). The intent wasn't to penalize, but to adjust the weight in the interest of, yes, fairness. If a restaurant got a bad review from the Phoenix, they got a bad review from the Phoenix, and that went into the "average" score. If a restaurant wasn't reviewed by the Phoenix, they didn't lose *or* benefit from that missing review.
Oh, and we averaged the three Zagat scores (food, service, decor) because those are the same factors that determine any review score, so Elaine is correct.
-
re: amytraverso
"That said, a review from over a year ago was down-weighted by 10% but more importantly,"
The Rialto review is over a year old. How is it still 100?
Re: Rendezvous: How does a 100 become an 81 after two years and not penalize the restaurant?
Now look at the numbers for Radius. It has two reviews almost 10 years old and a Globe review over 2.5 years old. That didn't seem to keep them from getting a pretty high 82. Granted, they have some higher numbers in other categories, but a really poor Chowhound one.
Regarding Zagat: Rialto didn't get a score for decor (how could that be? the gods must really not like me) and Rendezvous didn't get a great score for decor. Service differs by one point, which, if you leave the decor score out of the average, means Rialto is half a point higher on a scale of 30, not such a big difference. Including the decor score would leave Rendezvous with a score a couple of points lower. Did they get penalized for their decor score that badly in your survey? Do you really think Matt Schaefer (? apologies if wrong) bases his reviews on decor??
The review I saw in the archives for Radius doesn't have any stars or letters, so I don't know where the 4/4 came from unless I have different info.
I don't think it all works out to apples vs apples.
Try a little visual exercise with your spreadsheet. Cover up the Globe review for Rialto and the Phoenix review for Rendezvous. As I have stated before, I put Rialto in a higher category personally. Look at your spreadsheet with those numbers covered. What conclusion do you draw?
-
Well-done piece, which I enjoyed. However, last time I checked, the outstanding Blue Ginger was not within Route 128. The copy editors should be forced to eat at the 50 worst restaurants in Boston, whatever they might be.
-
re: mats77
Blue Ginger presumably squeaks in on its celebrity chef and national rep; you'll notice the Chowhound score was the lowest of the bunch (I know I'm not a big fan).
L'Andana also made the list and it's outside of 128.
Don't forget BoMag's target audience; I don't know many people who actually live inside of 128 and subscribe to it.
-
re: mats77
L'Andana is pretty fine, very much in the Sorellina mode, but with bigger portions and in a space that looks like it used to be a furniture warehouse from the outside.
I notice that many restaurants try to glom onto neighborhoods they don't quite reside in. On OpenTable, the Copley Place Legal Sea Food lists itself as being in the South End, as do Da Vinci, 33, STIX, and Mistral, though I think most or all belong in Back Bay.
-
re: Wursthof
I can think of a few reasons why BoMag would not reveal this: 1) it would be a lot of work to lay out the statistical weightings just to satisfy some curious Chowhounds; 2) it's proprietary and they have future plans for it; or 3) revealing the weighting might tell more about the survey than they really want known (e.g., potential holes in their methodology, or how heavily they tilt in favor of their own house critic).
I'm appreciating it for the effort. At best it can only be a snapshot anyway (one has to rely on two-year-old reviews, while chefs change, service and food slip or improve over time, etc.). Chances are they are factoring in someone's opinion you don't care about or trust, anyway.
-
re: MC Slim JB
1. I don't think it would be much more work; they already had the numbers to make the calculations. All they have to do now is release the spreadsheet that includes them. 2. You might have a point, but IMO there are a good number of restaurants that do not belong on that list. Selling something flawed isn't good. Revealing the weighting would remove the mystery so curious people like myself can look at the information and form accurate opinions (decisions). If they publish an article supposedly based on mathematics (applied to others' opinions), they should PUBLISH THE INFORMATION TO PROVE THE MATHEMATICS. 3. I don't favor many of Kummer's opinions but do agree with little tidbits of his. I may actually be in more agreement with the other editors' (at BoMag) opinions, based on the points they assigned to restaurants. I just don't like the fact that they can publish lists with numbers based on math with no backup. They assigned values to other people's opinions and are not telling anyone what those opinions were worth. If they had just come out with a list and said these are our top 50 restos, I wouldn't have much to say other than okay, whoop dee doo.
How does Hungry Mother make the top ten?
Why was Franklin Café left off the list? A pretty glaring omission for a really good restaurant. Silvertone also omitted??
Orinoco was included in the survey but places like Dok Bua/Angela's/any Chinese restaurant in Chinatown were omitted??
Prose???
Douzo included but Fugakyu not?
Marliave gets a 14.6 rating based on three reviewers out of 8 (although I am sure it was too early to get a Zagat rating). The whole thing just smells to me, and I was just hoping for some backup from the magazine people to support their math.
Chances are pretty good they won't answer anything that doesn't serve their own needs.
-
Our copy of Boston Mag. arrived yesterday and I was glad but surprised to see that Boston Chowhounds were included in the survey. The quote was silly to say the least..... and stating that we hate the PG was inappropriate, I think...
even tho.
Thanks for posting this. I hadn't seen it yet. Participating in the survey was fun and gave me a list of many more restaurants I need to get to or return to.
ETA: just read the line about hounds being smurfs. Um, what?
-
re: LindaWhit
It looks like there is a lot of room to fudge things there. Were restaurants penalized for not being reviewed by certain media? I mentioned before how the weighting might treat a review 11 years old versus one 2 years old.
I guess one of the easiest comparisons to make, just based on the numbers, is Rialto against Campania. Just for the record, I like both restaurants and consider them to be among the top in the area, and I give a slight edge to Rialto. If you drop the Herald score (by the way, how did they compare a B or a 1.5 when they changed the rating scheme?) and look at the rest of the scores other than the Globe review, then they look closer, especially since the BoMag rating is the same and is supposed to be weighted higher. They both have similar Zagat ratings and identical PG ratings, and the glaring difference is the 4-star Globe review from '07 (Rialto) versus a 2-star Globe review from '98 (Campania). If you gave the Globe review a .167 multiplier to even things out, that would bring the Globe score to 16.6 for Rialto and 8.35 for Campania. Give everything else an even weight, add them up and divide by 6, and Campania would have the higher score: 71.89 versus 66.383 for Rialto. Campania missed the top 50 by about 5 points, and Rialto was about 15 points higher than the number 50 restaurant. If you perform the same exercise and include L'Andana, Il Capriccio, Davio's and EVOO, then the rankings would be, in order:
Campania
Rialto
Davio's
EVOO (I substituted the PHO score for PG; had there been a PG score it probably would have been higher)
L'Andana
Il Capriccio
I'm not sure what I gained by doing this other than to prove I had way too much time on my hands today. I would definitely be interested in finding out how the points were weighted and how penalties for non-reviews were used. I think you can pretty much make any numbers look any way you want if you try.
I will say I probably agree with the top tier of the rankings although not identically and I think the middle to end is very skewed. As I mentioned before Avila is in the top 50 and I have no agreement with that whatsoever and according to the rating by the CH reviewers it would seem they would agree.
Does anyone have any input?? I'll just scratch my head some more and wonder.
Another question: if L'Andana was rated higher than other similar restaurants in the west by BoMag staffers, then shouldn't it have gotten BoMag's Best Italian, West?
Way too much time on my hands today.
-
re: Wursthof
Apologies, Wursthof. I think the actual reason is probably a combination of MC's #1 (see below) and simply not monitoring the board vigilantly enough to jump with an instant response (!).
Here's what I know:
(1) Beyond sitting down and assigning ratings to each of the 117 restaurants, there was no finagling or subjective influence on the magazine's part. (Though that in itself is clearly a substantial amount of influence.)
(2) We input all the scores from the various critics into an Excel spreadsheet, which we sent to an out-of-house statistician with instructions (a) to count the Boston mag score twice and (b) to weight older reviews less than current ones. Beyond that, the statistician had control.
(3) If a place was not reviewed by, say, the Globe, the restaurant wasn't "penalized," per se, except that the other scores ended up being weighted more heavily.
(4) The statistician did statistician-y things to normalize the scores in a way that would allow her to turn an apples-to-oranges comparison (i.e., a Schaffer B+ and a Phantom 84) into an apples-apples one. The formulas underlying that normalization were all up to her; we had no opportunity to finagle, even if we'd wanted to.
(5) L'Andana received arguably a bigger Best of Boston award last year: Best New Restaurant for all of greater Boston.
(6) Much to the chagrin of the other side of the building (well...technically, downstairs), not too many advertisers ended up making the list.
Hope some of this helps. I'm glad there's continued interest in our little experiment.
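Points (2a) and (3) above can be sketched as a weighted average in which the magazine's score counts twice and missing reviews are simply left out. This is only an illustration of that idea, not the statistician's spreadsheet: the source names, scores, and function name are hypothetical, and the age weighting and normalization steps are deliberately omitted.

```python
def combined_score(scores, weights, default_weight=1.0):
    """Weighted mean over the sources that actually reviewed the restaurant.

    A missing review (None) is skipped, so the remaining sources
    automatically carry a larger share of the total weight.
    """
    present = {src: s for src, s in scores.items() if s is not None}
    if not present:
        raise ValueError("no reviews to average")
    total_weight = sum(weights.get(src, default_weight) for src in present)
    weighted_sum = sum(weights.get(src, default_weight) * s
                       for src, s in present.items())
    return weighted_sum / total_weight

# Assumption from point (2a): the magazine's own score counts twice.
weights = {"boston_magazine": 2.0}

# Made-up percentages, for illustration only.
reviewed_by_all = {"boston_magazine": 90, "globe": 75, "phoenix": 60}
missing_phoenix = {"boston_magazine": 90, "globe": 75, "phoenix": None}

score_all = combined_score(reviewed_by_all, weights)      # (180 + 75 + 60) / 4
score_missing = combined_score(missing_phoenix, weights)  # (180 + 75) / 3
```

In this made-up case the restaurant without the Phoenix review ends up with the higher average, because the score that drops out happens to be its lowest one. Whether a missing review helps or hurts depends entirely on what that review would have said.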
-
re: Jolyon Helterman
Jolyon,
Thank you for the feedback and for at least humoring me a little. I, for one, have lots of interest in this experiment. I like this numbers stuff and what baseball GMs have done with some of it. Maybe it has a place here in the food world.
Did anyone verify the statistician's work? Was the actuary unbiased? Was the actuary given a blind (coded?) list of restaurants so as not to be able to associate a name with his/her test? I'm no actuary (quite far from it, actually), but what you typed above actually supports what I stated below in my little mathematical exercise in a previous post.
I can provide another example:
I don't want to single out Rialto (I certainly believe it belongs higher in the top 50 than it placed), but it just happens to be listed right below Rendezvous on the spreadsheet alphabetically. Look at their numbers in comparison. Rendezvous has equal or higher ratings in EVERY category except the Globe review, including a higher rating in the BoMag column (they both have identical Zagat scores, for those that can't see that). Rendezvous has the added benefit of a 4/4 review from the Phoenix to help (as you say, "weigh more heavily in its favor"). How did Rialto end up with a higher overall rating than Rendezvous? It would appear the Globe review outweighed your opinions at BoMag. Could a Globe review for Rendezvous (3/4) 13 months older than Rialto's (4/4) sink it that much further? That seems preposterous, especially since BoMag was supposed to be weighted double. I contend that a number of restaurants quite possibly have had a disservice done to them, and some statistician could be to blame. It's time to check the statistician's work. I don't think it takes a mathematical genius to see that something isn't right with these scores.
I appreciate your feedback and look forward to hearing more insight from the inside.
I think the statistician owes you money back.
-
re: Wursthof
jhelterman at bostonmagazine.com; same format for Amy's.
Sorry, it's been weeks of 23-hour days trying to get the second and third issues of the redesigned magazine off to the printer (when you're learning how to build a new type of page, everything takes twice as long...). But we haven't forgotten about the request. Amy and I are compiling a list of the excellent questions from your posts and will be sending them out to our statistician...just as soon as our (well, my...) hair isn't on fire.
Thanks for your patience. Happy to respond in this forum (so that any other hound who's interested can read), but feel free to write to our e-mails as well.
-
re: Wursthof
Getting specific answers to your questions.
What I am learning, though (as a bit of foreshadowing), is that not only were scores from reviewer to reviewer converted to an apples-to-apples system, but that each critic's scores were normalized---so that each reviewer has the same "mean." That is to say (and real statisticians can correct me on this...and, moreover, we'll have our own statistician explain it better), if a reviewer whose mean is 3 gives a 3.5, it counts more positively than a reviewer whose mean is 3.5 giving the same score. So eyeballing the scores won't necessarily alert you to problems with the computations unless you're comparing the same array of scores given by the same reviewers.
Will try to get you a more official answer, as well as explain one or two examples, but (honestly) we don't have the resources to get the statistician to "show our work" for every last example---wish we did.
ALSO: She did the calculations blind to both neighborhood and restaurant name.
Stay tuned.
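The "same mean" idea described above can be illustrated with simple mean-centering: subtract each critic's own average from every score that critic gave. This is only one plausible reading of the normalization, not the statistician's actual procedure (which may also rescale the spread of each critic's scores); the critic names and scores are made up.

```python
def center_by_critic(scores_by_critic):
    """Subtract each critic's mean so every critic averages zero."""
    centered = {}
    for critic, scores in scores_by_critic.items():
        if not scores:
            continue  # a critic with no reviews has no mean to subtract
        mean = sum(scores.values()) / len(scores)
        centered[critic] = {name: s - mean for name, s in scores.items()}
    return centered

# Made-up star ratings, for illustration only.
scores_by_critic = {
    "tough_critic": {"A": 3.5, "B": 3.0, "C": 2.5},     # mean 3.0
    "generous_critic": {"A": 3.5, "B": 4.0, "C": 3.0},  # mean 3.5
}
centered = center_by_critic(scores_by_critic)
```

Here the same 3.5 for restaurant A becomes +0.5 from the tougher critic and 0.0 from the more generous one, which is why raw scores that look identical on the spreadsheet can contribute differently to the final number.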
-
re: Jolyon Helterman
Jolyon,
Thanks for your efforts. Converting to an apples-to-apples system still means someone assigned a value to a star or a letter.
I understand normalization of scores and how it works. It still doesn't hold water in my example comparing Rendezvous and Rialto. I know resources are slim; just have her address my arguments in the above two posts. That certainly wouldn't be too much work ;)
Good to hear the analysis was done blind. Anyone with any integrity would have done it that way anyway.
Thanks again!