It turns out that I was wrong about the role of the normalizing ratio-of-factorials in calculating the BIC (see the original post for details). Worse, I was wrong because of what is, in retrospect, an obvious flaw in my thinking. Better, somewhat, is the fact that my ratio-of-factorials calculator works. Worse, again, is the fact that it took me as long as it did to solve what wasn’t really a difficult problem, that I don’t really have a use for the solution, and that writing about it amounts to pointing out, with mathematical rigor, exactly how dumb I’m capable of being. Better, perhaps, is the fact that I have a very small audience (1 +/- 1). If I ever come to write posts for a popular blog, it will be nice to have worked out such kinks beforehand in relative anonymity.

To recap, the BIC (or Bayesian Information Criterion) is a model fit statistic that takes into account the goodness of fit and the complexity of the model needed to achieve the fit. It is defined as

-2 log(*L*) + *k* log(*N*)

where the left term is (twice) the (negative) log likelihood of a model, and the right term is the product of the number of free parameters (*k*) and the log of the sample size (*N*). The lower the BIC, the better the fit. If you exponentiate the BIC, you get

*N*^{k}/*L*^{2}

The likelihood I am dealing with (the multinomial likelihood) has two (multiplied) parts, and I was worried that one part (the normalizing constant) could cause problems by (when present) multiplying only the left, ‘fit’ term of the BIC, leaving the right, ‘complexity’ term alone. I “figured” that the BIC *without* the normalizing constant could lead to one conclusion, while the BIC *with* the normalizing constant could lead to a different conclusion. Now that I’ve thought about it more carefully, I see that the normalizing term in the BIC is an additive constant, since it’s a multiplier in the ‘raw’ likelihood and the BIC contains the *log* likelihood. The end result being that the BIC will give the same answer with or without the normalizing constant.
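To make that concrete, here is a small Python sketch (the numbers are made up) showing that a shared normalizing constant shifts every model’s BIC by the same amount, and so can’t change which model wins:

```python
from math import log

def bic(log_lik, k, n):
    # BIC = -2 * log likelihood + (number of parameters) * log(sample size)
    return -2 * log_lik + k * log(n)

# Two hypothetical models fit to the same data set of n = 2000 trials.
n = 2000
log_lik_a, k_a = -2000.0, 6  # unnormalized log likelihoods
log_lik_b, k_b = -2400.0, 4
log_norm = 123.45            # log of the shared normalizing constant

bic_a, bic_b = bic(log_lik_a, k_a, n), bic(log_lik_b, k_b, n)

# Normalizing multiplies each raw likelihood by the same constant,
# which *adds* log_norm to each log likelihood, shifting both BICs
# by exactly -2 * log_norm.
bic_a_norm = bic(log_lik_a + log_norm, k_a, n)
bic_b_norm = bic(log_lik_b + log_norm, k_b, n)

assert (bic_a < bic_b) == (bic_a_norm < bic_b_norm)
```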

Anyway, fueled by erroneous thinking, I had, in the back of my mind, been working out how to overcome the computational difficulties presented by the multinomial normalizing constant for some time. Well, I figured out how to do it, and it seems quite simple in retrospect. The multinomial normalizer is a ratio of factorials, but the factorials I need to calculate are too big for direct calculation. The solution is to switch the order of operations around, put the elements of the factorials into vectors, and divide element by element before carrying out the multiplication.

Better yet, if you take the logs of the elements of the factorials first, and substitute subtraction and addition for division and multiplication, you can work with much larger factorials and still get the desired multinomial normalizer.
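A minimal sketch of the log version in Python (the counts are arbitrary, and `math.lgamma`, which computes log-factorials directly, is used only as a cross-check):

```python
from math import lgamma, log

def log_multinomial_coef(counts):
    # log( N! / (n_1! * n_2! * ...) ) computed as a difference of sums
    # of logs, so no factorial is ever formed directly.
    N = sum(counts)
    log_num = sum(log(j) for j in range(1, N + 1))
    log_den = sum(log(j) for n in counts for j in range(1, n + 1))
    return log_num - log_den

# Counts far too large for direct factorial calculation (2000 responses):
counts = [900, 450, 400, 250]
val = log_multinomial_coef(counts)

# Cross-check: lgamma(n + 1) is log(n!).
check = lgamma(sum(counts) + 1) - sum(lgamma(n + 1) for n in counts)
assert abs(val - check) < 1e-6
```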

The end result is that I’m glad I worked out how to calculate the normalizer for the multinomial likelihood, but it would be nice to have had a good reason to do it.

I’m also trying to figure out what, exactly, to blog about. Josh and I have discussed co-blogging about language, and I like the idea, but given how irregularly I blog on my own, I’m hesitant to commit to upkeep on a co-blog. Then again, perhaps committing to such an enterprise would be just the kick in the pants I need to blog as regularly as I (tell myself I) want to.

I think that blogging about politics is not really my game. Although there is a regular supply of political material to respond to, it’s hard to justify spending the time I would need to write about issues as thoughtfully as they deserve. Josh does a much better job with this kind of blogging than I do. He has a much vaster store of historical and political knowledge, which (along with the practice afforded by regular blogging) allows him to write thoroughly and thoughtfully about news items as they occur. Given my relatively limited ability to blog about politics, and given how busy I am writing papers for publication, getting ready for a conference presentation, and (as of today) taking a course in probability theory, I just can’t commit too much time to this kind of blogging.

So, instead, I will blog about that which I am working on anyway. I’ve always intended my blog to be about research, mostly on mathematical models of perception and decision making, but also on a variety of issues that arise in conjunction with this. Hence, today’s post on factorials.

I employ multidimensional signal detection theory in the study of auditory perception. Briefly, this means I collect and analyze identification-confusion data in tasks in which each stimulus has one of two levels on each of two dimensions (e.g., purple or red, square or rectangle), and each combination of levels-on-dimensions (i.e., each stimulus specification) has a unique response. The general method extends to more levels and more dimensions, but for a variety of reasons, I stick with two-by-two (and lower) structures.

I like to analyze my data by fitting (and comparing) models. I take a given subject’s data and try to find the set of bivariate normal densities and decision bounds (more on this in another post) that most closely ‘predicts’ the observed counts of identifications and confusions. Each trial in one of these experiments consists of stimulus presentation and response execution. Because the response set is the same across trials, the data (i.e., the counts of the four responses) are distributed as multinomial random variables. Here’s where factorials come into the picture.

Pretty much any fit statistic involves a likelihood function. The multinomial likelihood function is proportional to the product of the parameters raised to the appropriate powers (i.e., the counts of the responses). So, for presentations of a red-square stimulus, the data would be the number of times each response was given, so the multinomial likelihood would be the product of the predicted probability of each response raised to the number of times that response was actually made (I would like to have this written out mathematically, but I can’t figure out right now how to get the sub- and super-scripts working).

In order to make it a properly normalized likelihood function, you have to multiply this product by a ratio of factorials, specifically, the factorial of the total number of responses divided by the product of the factorials of each individual response. Now, for a variety of reasons (again, more another time), I collect a *lot* of responses in these experiments. So many, in fact, that I can’t calculate the requisite factorials. If I were content to use regular, old-fashioned likelihood ratio model testing, this wouldn’t matter: two likelihoods for the same data set have the same normalizing constant, so it cancels in the ratio, and there is no need to calculate the factorials.
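For the record (and since sub- and super-scripts are hard to render here), the normalized multinomial likelihood the last two paragraphs describe can be written in LaTeX, with the *p*’s the predicted response probabilities and the *n*’s the observed counts:

```latex
L(p_1,\dots,p_m \mid n_1,\dots,n_m)
  = \frac{N!}{n_1! \, n_2! \cdots n_m!} \prod_{i=1}^{m} p_i^{\,n_i},
\qquad N = \sum_{i=1}^{m} n_i .
```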

I’m not content to use regular, old-fashioned likelihood ratio tests, though. Instead, I use the assuredly fancy-pants fit statistic known as the Bayesian Information Criterion (BIC), defined as -2 log(*L*) + *k* log(*N*), where log is the natural logarithm, *L* is the likelihood, *k* is the number of free parameters in the model, and *N* is the sample size. The basic idea behind the BIC is that it measures fit (the first term) and model ‘complexity’ (second term). The better your model fits, the lower the negative log likelihood, and so the lower the BIC, but the more parameters you need to get that fit (and the larger your sample size), the higher the BIC.

The BIC makes use of a rather crude measure of complexity (hence the scare quotes), but it relates directly to some other handy tools. For example, the exponential of minus half the difference between two BIC values approximates the Bayes factor, which is a pleasantly intuitive (rare in statistics) measure of the relative goodness of fit of two models – the Bayes factor essentially tells you how much more belief-worthy one model is relative to another. Of course, you immediately encounter the same old issue of how big (or small) is big (or small) enough to warrant a strong conclusion, but that seems inescapable.
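As a sketch with made-up BIC values, the BIC-to-Bayes-factor conversion (an approximation, not an exact identity) looks like this:

```python
from math import exp

# Hypothetical BIC values for two models fit to the same data.
bic_a, bic_b = 4300.0, 4306.0

# exp(-(BIC_a - BIC_b) / 2) approximates the Bayes factor in favor
# of model A over model B; values above 1 favor A.
bayes_factor = exp(-(bic_a - bic_b) / 2)
print(round(bayes_factor, 2))  # exp(3), roughly 20: A is ~20 times more belief-worthy
```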

The point, finally, is that while the normalizing constant is the same within a given subject’s data (and all my analyses are at the individual subject level), it only multiplies the likelihood, leaving the complexity term alone. Thus, though it is perhaps unlikely, it is possible that leaving out the normalizing constant and its many factorials could lead me to the wrong conclusions. Here’s a contrived example: Suppose the (unnormalized) log(*L*) = -2000 for one model, and log(*L*) = -2400 for another, and the complexity terms for the two models are 300 and 200, respectively. Plugging these into the BIC formula gives 4300 and 5000, leading to preference for the first model. Now suppose that the normalizing constant (for both) is 1/20. Including this in calculating the (proper) likelihood values leads to BIC values of 500 and 440, respectively, leading to preference for the second model. Oops.

So, the end result is that I will spend a good chunk of time today figuring out a way to calculate the ratio of factorials so that I can normalize my likelihoods appropriately. The basic idea (which shouldn’t take long to execute – less time, perhaps, than it has taken me to blog about it) is to create vectors with the elements that are to be multiplied in the factorials for the numerator and denominator and divide them element-wise prior to taking the product. Which is to say, I am going to violate the order of operations handed down by Moses as he descended from Mt. Sinai. I’ll also do this for the log likelihood, only with adding and subtracting. I’ll post again with updates (and I’ll get to the posts I promised last time at some point, too). I don’t guess this will change the results of my analyses, but I don’t know yet for sure.
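A sketch of the element-wise plan in Python (fine as long as the coefficient itself fits in a floating-point number; the log version, with its adding and subtracting, is what handles truly huge counts):

```python
from math import prod

def multinomial_coef(counts):
    # Numerator factors are 1..N, where N is the total count; the
    # denominator factors (1..n_i for each count) also number N in
    # total, so the two vectors line up element by element.
    N = sum(counts)
    num = range(1, N + 1)
    den = [j for n in counts for j in range(1, n + 1)]
    # Divide element-wise first, then multiply: no factorial is
    # ever formed on its own.
    return prod(a / b for a, b in zip(num, den))

print(multinomial_coef([3, 2, 1]))  # 6!/(3! 2! 1!) = 60.0
```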

My plan in the long run is to have my own fancy-pants personal website with this blog appearing somewhere therein. If I like wordpress (and so far, I do), I’ll use their software there. For the time being, I’ll be using it here, and, once I post a notice to the original sourcefilter blog, I will only be posting here.

Coming soon: a post on noise, signals, and data processing; a post on brain imaging and cognitive science.

Anyway, Josh points to some hints that Bob Barr will be the Libertarian Party presidential candidate, which reminded me of a recent Cato-at-liberty post on how non-libertarian Bob Barr was as a representative.

1. Josh linked to this blog today, and on the off chance that he has readers who follow that link, I wanted my blog to be less pathetic than a year-old post about which Serenity character I am most like would suggest it is.

2. I’ve gotten back into the habit of reading Pharyngula, and there was a silly thing written there today.

Specifically, PZ Myers wrote, in support of a new law in California making gay marriage legal, that “if you want to do something more substantive, promote equal rights legislation in your state, so that all 50 states someday offer this basic privilege to everyone.”

The silliness resides in the idea that government can (and should) be offering a ‘basic privilege’ to everyone.

Privilege, by definition, does not get offered to everyone. This may simply be a semantic nit to pick, but it caught my attention because it is typical of left-leaning gay marriage advocates to discuss gay marriage in terms of rights, not privileges.

In any case, the government should only be in the marriage business insofar as marriage is a form of contract and the legal system may be called upon to protect one or another party’s property interests. It is clear to me that pairs of gay adults, like pairs of any adults, should be allowed to enter into any contract, as long as they do so by choice.

That’s all for now. I hope to blog more regularly in the (near) future, though it’s almost certain I won’t be doing so as prolifically as Josh has been lately.

You scored as The Operative.

You are dedicated to your job and very good at what you do. You’ve done some very bad things, but they had to be done. You don’t expect to go to heaven, but that is a sacrifice you’ve made for a better future for all.

Which Serenity character are you?

All in all, it looks good. Unlike Josh, neither ‘A Reaver’ nor ‘Alliance’ appears on my list at all, but then again, I’m apparently most like the completely amoral Operative (with a close second for Zoe!).

Good fun, and it’s given me a reason to revive my blog (at least in the short term).

**Update**: My wife’s results (this says something about our family, I’m sure, but who knows what exactly):

Which Serenity character are you?

Expository dialogue and tricky cinematography almost make it a Tarantino.

To make a long(ish) story short(ish), Glenn Greenwald was disturbed to see the Washington Post praising recently deceased Chilean ex-dictator Augusto Pinochet. He drew parallels between US support for Pinochet’s foreign lawlessness back then and support for domestic lawlessness today (note that he did not draw explicit parallels between Bush and Pinochet – he’s not dumb, and he’s not dishonest [Greenwald, not Bush or Pinochet]). I felt that the Post’s editorial was less awful than Greenwald felt it was, and I posted a comment to that effect.

I argued that, as a historical case study (as opposed to a model on which to base one’s own plans), Pinochet’s ‘free-market’ economic policies are distinct from the violent political oppression of his regime. I made some facile comparisons between Castro and Pinochet and argued that the relative stability of Chile over the years was due, at least in part, to Pinochet’s economic policies.

Others shot back that Pinochet’s economic policies weren’t even that beneficial, that they don’t justify the political oppression (which I explicitly agreed with, even before this ‘objection’ was made to my argument), that welfare states ‘just work’, that laissez faire capitalism is equivalent to Dickens’s London, and that I am a lying Nazi-sympathizer (way to respect the level of discourse that Glenn studiously maintains, ‘truth machine’!).

I don’t actually know that much about Pinochet’s economic policies. It may well be the case that they were not good for Chile. It does seem to be the case that Chile has been more economically stable, and more economically healthy, than most other Latin American countries for much longer, but I’m happy to admit that this could be for reasons independent of Pinochet’s economics. I remain unconvinced that welfare states ‘just work’ and that laissez faire capitalism is a bad idea. In addition, I value honesty very highly and, for what it’s worth, I’m not a big fan of the Nazis.

All that said, it’s kind of embarrassing to admit that this morning – a full two days after getting into the discussion at Unclaimed Territory – it occurred to me that Pinochet’s economics and politics are not, in fact, separate. I am pro-free-market primarily because I don’t like the idea of someone else making my decisions for me. It seems to me that no government official, whether democratically elected or installed from abroad, has the wisdom to plan an economy better than the mass of humanity participating in a market can. There’s certainly no reason to think that any government officials are better suited than individuals are to make day-to-day decisions about who to associate with, what to buy, what to sell, or how hard to work. I think everyone would be better off, at least in the long run, if they had the opportunities afforded them by free markets.

It should have been obvious to me on Tuesday that imprisoning, torturing, and murdering political opponents is 100% antithetical to these values. It is as clear as day (today anyway) that Pinochet’s political oppression of Chileans represents an utter lack of respect for private property, a crucial underpinning of any truly free market. After all, if a person’s self is not owned by that person, then what is?
