# Math Fun

Fun with Polygons: Recently, I ran across an incredibly charming example of how theory, application and computation blend together beautifully in linear algebra. The example is the subject of the 2018 SIAM John von Neumann Lecture delivered by Charles Van Loan, an outstanding researcher in numerical linear algebra, in which he made some fascinating observations on how to systematically “untangle” a randomly selected polygon, no matter how convoluted it is. The idea is as follows: say that the polygon is described by distinct vertices (x0, y0),…,(xn, yn), where the last vertex is connected to the first. Consider this iterative process for constructing successive polygons:

1. Construct the initial polygon by a random selection of x- and y-coordinates for n+1 vertices, (x0,y0),…,(xn, yn).
2. Translate the polygon so that its centroid is at the origin (average the x-coordinates and the y-coordinates separately, that is, divide each sum by n+1, and subtract the resulting averages from the corresponding coordinate of every vertex).
3. Use the vector 2-norm to normalize the size of the polygon so that the norms of the vectors of x- and y-coordinates are both one.
4. Plot this polygon in the xy-plane.
5. Construct a new polygon from the old by replacing each vertex (xi, yi) by the midpoint ((xi, yi) + (xi+1, yi+1))/2 of the segment connecting it to the next vertex of the polygon, where the indices are computed mod n+1.
6. Repeat steps 3-5 until the polygon shapes settle down to an approximately constant form.
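If you'd like to follow along outside ALAMA, the steps above can be sketched in a few lines of Python. This is my own hedged translation, not one of the ALAMA programs; it uses standard normal coordinates (as in the ’N’ programs) and omits the plotting of step 4:

```python
import math
import random

n1 = 14          # n + 1 vertices, as in the figures below
iterations = 10  # number of passes through steps 2-5

# Step 1: random initial polygon with standard normal coordinates
x = [random.gauss(0, 1) for _ in range(n1)]
y = [random.gauss(0, 1) for _ in range(n1)]

for _ in range(iterations):
    # Step 2: translate the centroid to the origin
    mx, my = sum(x) / n1, sum(y) / n1
    x = [xi - mx for xi in x]
    y = [yi - my for yi in y]
    # Step 3: normalize so each coordinate vector has 2-norm one
    nx = math.sqrt(sum(xi * xi for xi in x))
    ny = math.sqrt(sum(yi * yi for yi in y))
    x = [xi / nx for xi in x]
    y = [yi / ny for yi in y]
    # (Step 4, plotting, omitted here)
    # Step 5: replace each vertex by the midpoint of the edge to the next one
    x = [(x[i] + x[(i + 1) % n1]) / 2 for i in range(n1)]
    y = [(y[i] + y[(i + 1) % n1]) / 2 for i in range(n1)]
```

Note that the midpoint construction preserves the centroid, so once the polygon is centered it stays centered; only the renormalization of step 3 needs repeating.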
The tools in this exploration are two ALAMA calculator programs (VanLoanExampleN.alm and VanLoanExampleNcplx.alm), which are available in the folder ALAMAprograms, or you may write your own programs that achieve the same results. The suffix ’N’ means that the standard normal distribution was used to generate the initial random polygon. In addition, the suffix “cplx” means that the following alternative method was used in place of steps 3 and 4: view the x- and y-coordinates of a point as the real and imaginary parts of the complex number z = x + yi, normalize the resulting vector of complex numbers with the complex 2-norm, plot the resulting polygon in the complex plane and construct a new polygon by averaging successive vertices. Here are pictures of the results, where the number of vertices is 14 and 10 iterations of the process above are used to generate each figure using the real and complex versions of the algorithm. The figures are scaled up by factors of 3 and 4, respectively, and the colors of successive figures, starting with the random polygon, are black, red, green, blue, dark yellow and cyan:

Every time you repeat this experiment, you get similar results: the limiting polygon takes a familiar shape. How to explain this settling down to a constant convex figure? Linear algebra to the rescue! I'll leave the full explanation for the reader to explore, but here's the first step: the averaging of step 5 can be viewed as a matrix/vector multiplication, where the vectors are the vectors of x- and y-coordinates (or a single vector with complex coordinates x + yi in the case of the complex version).
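To make that first step concrete, here is a short NumPy check (my own illustration, not one of the ALAMA programs) that the midpoint construction of step 5 is multiplication by the circulant matrix M = (I + S)/2, where S is the cyclic shift matrix:

```python
import numpy as np

n1 = 14                                  # n + 1 vertices
S = np.roll(np.eye(n1), -1, axis=0)      # shift: (S x)_i = x_{(i+1) mod (n+1)}
M = 0.5 * (np.eye(n1) + S)               # step 5 packaged as one matrix

x = np.random.randn(n1)                  # a vector of x-coordinates
midpoints = np.array([(x[i] + x[(i + 1) % n1]) / 2 for i in range(n1)])
assert np.allclose(M @ x, midpoints)     # same result either way
```

Since M is circulant, its eigenvalues and eigenvectors can be written down explicitly, and that is the doorway to explaining the limiting shape; I'll leave that pleasure to the reader.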

Fun with Number Theory: I'm a fan of Great Courses offerings, and recently I had the pleasure of watching the offering “Introduction to Number Theory” by Professor Edward Burger. The course was designed for those who may have been long away from their high school algebra (as it should be), so it wasn't terribly exciting for me personally until he made this delightful “come-on” in Lecture 2: you are assured that there is some power of 2 whose first nine digits are exactly your social security number. I had to wait until Lecture 21 to begin to see some justification for this “powers of 2” assertion. I was intrigued by some unproved statements concerning this claim, so I went about writing my own proofs in the spirit of Professor Burger's course: requiring no more than high school algebra and geometry. In fact, I ended up writing a brief essay, which includes a modest introduction to number theory complete with a few exercises. You can see it in the following pdf file, which you may have to download if your viewer doesn't show it.

FunWithNumbers
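The claim itself is easy to experiment with. Here is a minimal Python search (my own sketch, not part of the essay) based on the standard observation that 2^k begins with the digits of m exactly when the fractional part of k·log10(2) lands in the window [log10(m) − (d−1), log10(m+1) − (d−1)), where d is the number of digits of m:

```python
import math

LOG2 = math.log10(2)

def exponent_with_prefix(m: int, max_exp: int = 1_000_000):
    """Smallest k >= 1 such that the decimal digits of 2**k start with those of m."""
    d = len(str(m))
    lo = math.log10(m) - (d - 1)       # fractional-part window for this prefix
    hi = math.log10(m + 1) - (d - 1)
    for k in range(1, max_exp):
        if lo <= (k * LOG2) % 1.0 < hi:
            return k
    return None

print(exponent_with_prefix(12))    # 7, since 2**7 = 128
print(exponent_with_prefix(409))   # 12, since 2**12 = 4096
```

Because log10(2) is irrational, the fractional parts of k·log10(2) turn out to be dense in [0, 1), so the window is always eventually hit, and that is exactly the kind of fact the essay sets out to justify with elementary tools.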

ALAMA Calculator helped me in two ways: first, experimentation with it gave me the clues that I needed to write proofs of the results substantiating the “powers of 2” claim. Secondly, I had a bit of fun verifying simpler versions of the claim, which are included as exercises at the end of “Fun with Numbers”. If you want to play around with these ideas, there are a few ALAMA Calculator programs that you might find helpful, namely a sorting program, programs to construct/deconstruct continued fractions and, just for fun, a gcd program. Another program that I found useful in my experimentation was InsertionSortCndx.alm. This program sorts a list of numerical data but also preserves the initial indices of the sorted items, so that the position each item held before sorting is known. These are contained in the folder ALAMAprograms.
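The index-preserving idea behind a program like InsertionSortCndx.alm can be sketched in a few lines of Python (a hypothetical re-creation, not the ALAMA code itself): sort the positions rather than the values, then read the values back off in sorted order.

```python
data = [3.1, 1.4, 5.9, 2.6]

# Sort the index list 0..n-1 by the value each index points at.
order = sorted(range(len(data)), key=lambda i: data[i])
sorted_vals = [data[i] for i in order]

print(sorted_vals)  # [1.4, 2.6, 3.1, 5.9]
print(order)        # [1, 3, 0, 2] -- original position of each sorted item
```

Knowing where each sorted item came from is exactly what you want when, say, the data are fractional parts of k·log10(2) and you need to recover which exponent k produced each one.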

Fun with Coding: Speaking of “sorting”, some standard topics in an introductory computer science course are algorithms and their complexity. The problem of sorting a list of numbers (or words, or anything else that can be linearly ordered) can be used to illustrate these ideas. In fact, I needed a sorting program to help me find proofs of certain theorems in the “FunWithNumbers” pdf. I have claimed that ALAMA Calculator is fully programmable, so I decided to hold my feet to the fire and write suitable sorting programs on it. To add to the challenge, I decided that the programs should use only the single variable C, consisting of the list of numbers to be sorted as a row vector, augmented by two entries that mark the start and finish of the section of C to be sorted. One reason for this constraint is that ALAMA Calculator has only 10 named variables, so in case others are already in use, one should try to use as few extra variables as possible.

So, what algorithms to use? I decided to try two well-known methods, insertion sort and quicksort, which resulted in the ALAMA programs InsertionSortC.alm and QuickSortC.alm. (Aside to programmers: along the way I realized that it was very helpful to expand the functionality of the Rpt() command, which I did in version 1.1.1.) Given that n items are to be sorted, the insertion sort algorithm is known to have best, average and worst case complexities of O(n), O(n^2) and O(n^2), respectively. For quicksort, the corresponding complexities are O(n log(n)), O(n log(n)) and O(n^2). Accordingly, one is tempted to always use quicksort instead of insertion sort. On average, this is OK, but it's instructive to experiment with different lists of items and compare the timings for each method. Try this experiment on ALAMA Calculator:

B = [RndN([1:400]), 1, 400];
C = B;

Now load the program InsertionSortC.alm and execute the command Rpt(1,1), but also use a timer to determine the time of execution. Next, set C equal to B again (to ensure that we're comparing sorting of the same list) and repeat this experiment with the program QuickSortC.alm. My timing results on my Mac were about 22 seconds for insertion sort and about 3 seconds for quicksort. This is what you'd expect, right? Next, repeat the whole experiment with these initial values: B = [Ones(1,400), 1, 400]; C = B; My timing results were surprising: about 6 seconds for insertion sort and 25 seconds for quicksort. One learns in a detailed study of these algorithms that quicksort has more of a problem with many equal items in the list. Moral: matters are a bit more nuanced than one might expect if one depended only on the average complexities of these algorithms.
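The same phenomenon is easy to reproduce outside ALAMA. The Python sketch below (my own illustration; I don't claim it matches the ALAMA programs' internals) counts comparisons instead of using a stopwatch. It uses a textbook Lomuto-partition quicksort, which degrades to O(n^2) on a list of equal keys, while insertion sort drops to O(n) there:

```python
import random

def insertion_sort(a):
    """In-place insertion sort; returns the number of comparisons made."""
    comps = 0
    for i in range(1, len(a)):
        key, j = a[i], i - 1
        while j >= 0:
            comps += 1
            if a[j] > key:
                a[j + 1] = a[j]   # shift larger items right
                j -= 1
            else:
                break
        a[j + 1] = key
    return comps

def quicksort(a, lo=0, hi=None):
    """In-place quicksort (Lomuto partition, last element as pivot); returns comparisons."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    pivot, i, comps = a[hi], lo, 0
    for j in range(lo, hi):
        comps += 1
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return comps + quicksort(a, lo, i - 1) + quicksort(a, i + 1, hi)

n = 400
rand = [random.gauss(0, 1) for _ in range(n)]
ones = [1.0] * n

print(insertion_sort(rand[:]), quicksort(rand[:]))  # roughly n^2/4 vs a small multiple of n log n
print(insertion_sort(ones[:]), quicksort(ones[:]))  # n - 1 = 399 vs n(n-1)/2 = 79800
```

On all-equal input, every element lands on one side of the Lomuto partition, so the recursion shrinks by only one item per level, exactly the worst case; better quicksort variants (three-way partitioning, randomized pivots) avoid this, which is part of the nuance mentioned above.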

Fun with Probability: Here’s a not-too-uncommon situation: You’ve shown symptoms of a certain disease, so your physician had you take a test for the disease.  The results come back positive:  Should you be worried?  I collected data for a very specific example. To the best of my knowledge, the data are reasonably accurate.

Example.  The following estimates are drawn from various sources of data. It is estimated that for women in the United States under the age of 40, the mammogram test for breast cancer has a sensitivity of 71.3% (the probability of the test returning a positive result, given that you have the disease) and a specificity of 85.9% (the probability that the test will return a negative result, given that you do not have the disease). It is also estimated that breast cancer for these women occurs in approximately 115 out of 100,000 individuals. Now suppose that you are a woman under the age of 40 and your mammogram tests positive for cancer. How concerned should you be?

Well, that depends, doesn't it? The message of this exercise is that you should be cautious about rushing to judgment: initial intuitions can be inaccurate. I'm just going to give you the answer, which I computed with a simple one-line ALAMA Calculator program (BayesTest.alm): your likelihood of having breast cancer is about 0.58%. This is a rather surprising result. Not to say that there should be no concern about the outcome of this test, but the probability that the positive result is a false positive is very high, namely approximately 99.42%. So how and why is this so? Well, if you know enough probability theory to have seen Bayes' Theorem and its applications, you can guess the answer. If not (or you've long forgotten the relevant material), I've written an elementary introduction to probability theory that concludes with a full explanation of the above example:

FunWithBayes
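For readers who already know Bayes' Theorem, the computation is easy to reproduce. Here is a Python version of what a one-line program like BayesTest.alm presumably computes (my own sketch, using the figures quoted in the example):

```python
sens = 0.713            # P(positive | cancer): sensitivity
spec = 0.859            # P(negative | no cancer): specificity
prev = 115 / 100_000    # P(cancer): prevalence for women under 40

# Bayes' Theorem: P(cancer | positive) = P(pos | cancer) P(cancer) / P(pos),
# where P(pos) comes from the law of total probability.
p_pos = sens * prev + (1 - spec) * (1 - prev)
p_cancer_given_pos = sens * prev / p_pos

print(f"{p_cancer_given_pos:.2%}")   # 0.58%
```

The culprit is the low prevalence: nearly all positive results come from the large healthy population through the 14.1% false positive rate, swamping the relatively few true positives.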