Mathematics | George Shakan

In this somewhat different post, I am hosting the long-running Carnival of Mathematics. First I’ll talk about 223 (the issue number) and then I’ll round up some mathematical posts from December 2023.

It’s primetime we talk about 223. First of all, it is a lucky prime, and it is unknown whether there are infinitely many lucky primes. Writing 223 as a sum of positive fifth powers requires 37 terms, more than any other number. This is an instance of Waring’s problem, which has a rich history going back to Diophantus nearly 2000 years ago. Also, 223 is the number of permutations of 6 elements that have a strong fixed point (see below for Python code).

And now for what happened last month.

Quanta brings current research mathematics to a general audience. Solutions to long-standing problems happen all the time in math, and this video highlights some from the past year. The one I know the most about is at the end. The “Dense Sets have 3-term Arithmetic Progressions” problem has been the subject of intense study for over 70 years. Multiple Fields Medal winners over several generations have studied the problem, and a big breakthrough was made by two computer scientists (including a PhD student from my alma mater).
Quanta also posted their Year in Math. Particularly interesting to me is the discussion by Andrew Granville (my former mentor) of the intersection of computation and mathematics.
Here you can find a “Theorem of the Day,” where the author impressively keeps posting interesting theorems from mathematics on a daily basis.
An interesting story of a Math SAT problem with no correct answer.
John Cook’s explanation of a Calculus trick. See if you can figure out how his trick is a manifestation of the fact that, for a right triangle with angle $\theta$, $\sin(\theta)$ only depends on the ratio of the non-hypotenuse sides.
If you are interested in some drawings with math and a little bit of humor, you can find that here.
Another article from Quanta highlighting a very recent breakthrough. Interesting to me is how quickly (3 weeks!) the result was formalized on a computer. What this means is that a group of people got together and entered all of the results into a computer, which then checked their validity in an automated way. This process of entering the results can be quite painful, and I am impressed by the speed at which it was done. You can check out my previous blog post for my experience with it. Incidentally, one of the classical results I helped formalize in that blog post was used to formalize the recent result. The referee and review process for math papers is particularly painful for mathematicians. It is not uncommon for mathematicians to submit a paper and wait years for the decision to accept or reject it. Meanwhile, especially for younger mathematicians, their careers hang in the balance as they await these decisions. Applying software engineering best practices to this process, as done above, would, I believe, go a long way toward alleviating some of this anguish.

Here is the Python code for the strong fixed point verification.

from itertools import permutations
perms = permutations([1,2,3,4,5,6])

def fixed_points(perm):
    """
    Input: a permutation written as a tuple, e.g. (1,3,2) for 1->1, 2->3, 3->2
    Output: a list of the (0-based) indices of the fixed points of the permutation
    """
    return [ind for ind,p_ind in enumerate(perm) if ind+1 == p_ind]

def is_strong_fixed_point(ind,perm):
    """ 
    Input: An index and a permutation 
    Output: True if the index is a strong fixed point of the permutation, else False
    See here: https://oeis.org/A006932
    """
    for lesser_ind in range(0,ind):
        if perm[lesser_ind] > perm[ind]:
            return False
    for greater_ind in range(ind+1,len(perm)):
        if perm[greater_ind] < perm[ind]:
            return False
    return True

counter = 0 
for perm in perms:
    for fixed_point in fixed_points(perm):
        if is_strong_fixed_point(fixed_point,perm):
            counter +=1 
            break 
print(counter)
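
The fifth-powers claim about 223 can be checked in the same spirit. The following is a minimal sketch, not part of the original verification: it uses a simple dynamic program to compute the fewest positive fifth powers summing to each number up to a cutoff. The function name `min_fifth_power_terms` and the cutoff of 5000 are illustrative choices, and a finite search of course only confirms the claim within that range.

```python
def min_fifth_power_terms(limit):
    """Return a list t where t[n] is the fewest positive fifth powers summing to n."""
    # All fifth powers that could appear in a sum of size at most `limit`.
    powers = []
    k = 1
    while k ** 5 <= limit:
        powers.append(k ** 5)
        k += 1
    terms = [0] * (limit + 1)
    for n in range(1, limit + 1):
        # The last term of an optimal sum is some fifth power p <= n.
        terms[n] = 1 + min(terms[n - p] for p in powers if p <= n)
    return terms

LIMIT = 5000  # assumed search cutoff, chosen only for illustration
terms = min_fifth_power_terms(LIMIT)
worst = max(range(1, LIMIT + 1), key=lambda n: terms[n])
print(terms[223], worst)
```

Below $3^5 = 243$ the only available fifth powers are 1 and 32, so $223 = 6 \cdot 32 + 31 \cdot 1$ uses $6 + 31 = 37$ terms.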

Chat-GPT is an impressive new AI chatbot released by OpenAI. Impressive applications of it can be found all over the internet. But can it do math?

By math, we do not mean simply performing computations. Its own design ensures that there will be computational problems it will be unable to solve. What I am more interested in is whether it can solve problems that require some mathematical reasoning.

To choose our problems, we use the MMLU dataset. Galactica, a large language model recently released by Meta AI, has achieved some good results on this dataset. Their findings are in section 5.3 of their paper.

In what follows, we’ll pick a few problems and see how Chat-GPT handles them. If it fails, we will try to reword the question in hopes of achieving some partial results.

We note that the MMLU dataset has multiple choice questions, but we just present the questions to chat-GPT. We only choose questions from the abstract algebra section of the MMLU. Afterwards, we’ll prod chat-GPT for intuition and a proof. We’ll end by suggesting a few applications of chat-GPT for mathematicians.

Question 1

Me: The inverse of -i in the multiplicative group, $\{1, -1, i, -i\}$ is

Chat-GPT: The inverse of -i in the multiplicative group $\{1, -1, i, -i\}$ is $i$. The inverse of an element $x$ in a group is an element $y$ such that $xy=yx=1$, where $1$ is the identity element of the group. In this group, $1$ is the identity element, so the inverse of an element $x$ is an element $y$ such that $x*y=1$.

For example, the inverse of $i$ is $-i$ , because $i*(-i)=1$ . Similarly, the inverse of $-i$ is $i$ , because $(-i)*i=1$ .

Remark: Chat-GPT answers this question perfectly and gives an excellent explanation. To be fair, this is an easier question.
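
Since the group has only four elements, the answer is easy to confirm by brute force. Here is a small sketch using Python’s built-in complex numbers; the helper name `inverse` is made up for this illustration.

```python
# The multiplicative group {1, -1, i, -i}, with i written as 1j in Python.
group = [1, -1, 1j, -1j]

def inverse(x, elements):
    """Return the element y of `elements` with x * y == 1."""
    for y in elements:
        if x * y == 1:
            return y
    raise ValueError("no inverse found")

inverses = {x: inverse(x, group) for x in group}
print(inverses[-1j])  # the inverse of -i
```

The products involved here are exact in floating point, so comparing with `== 1` is safe for this particular group.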

Question 2

Me: How many homomorphisms are there of Z into Z_2? Please write in latex format.

Chat-GPT: There are two homomorphisms of $\mathbb{Z}$ into $\mathbb{Z}_2$ . They are defined as follows:

$f_1:\mathbb{Z}\rightarrow\mathbb{Z}_2$ such that $f_1(n)=n\pmod{2}$ for all $n\in\mathbb{Z}$

$f_2:\mathbb{Z}\rightarrow\mathbb{Z}_2$ such that $f_2(n)=0$ for all $n\in\mathbb{Z}$

Remark: Again, Chat-GPT is spot on. However, I asked this question a few times: once it answered one, and another time it answered two but gave a false justification:

There are only two homomorphisms of Z (the integers) into Z2 (the integers modulo 2).

One homomorphism sends every integer to 0, and the other homomorphism sends every integer to 1.
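
To see why the second justification is false, note that a homomorphism must satisfy $f(a+b) = f(a) + f(b)$ in $\mathbb{Z}_2$, and the map sending every integer to 1 fails this. The sketch below tests the three candidate maps on a finite window of integers; the names `is_additive_mod2`, `zero_map`, `parity_map` and `constant_one` are made up for illustration, and a finite check is supporting evidence, not a proof.

```python
def is_additive_mod2(f, window=range(-10, 11)):
    """Check f(a + b) == f(a) + f(b) (mod 2) for all a, b in a finite window."""
    return all(f(a + b) == (f(a) + f(b)) % 2 for a in window for b in window)

def zero_map(n):
    return 0

def parity_map(n):
    return n % 2  # Python's % returns 0 or 1 here, even for negative n

def constant_one(n):
    return 1

for name, f in [("zero", zero_map), ("parity", parity_map), ("constant one", constant_one)]:
    print(name, is_additive_mod2(f))
```

Any homomorphism out of $\mathbb{Z}$ is determined by where it sends 1, and $\mathbb{Z}_2$ offers only two choices, which matches the count of two.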

Question 3

Me: Let p = (1, 2, 5, 4)(2, 3) in S_5. Find the index of <p> in S_5. Please write your answer in latex.

Chat-GPT:

The index of <p> in S_5 is the number of elements in S_5 that come before <p> in lexicographic order.

To find the index of <p>, we must first determine the cycle type of p. The cycle type of a permutation is a list of the lengths of its disjoint cycles, listed in decreasing order.

The cycle type of p is [2,2], since p consists of a 2-cycle and a 2-cycle.

Now we can use the cycle index of S_5 to determine the number of permutations in S_5 with cycle type [2,2].

The cycle index of S_5 is given by:

$Z(S_5) = \frac{1}{5!} \sum_{i=1}^{5} a_i p_i^{5}$

where $a_i$ is the number of permutations in S_5 with i cycles, and $p_i$ is a variable representing a cycle of length i.

Substituting the cycle type [2,2] into the cycle index gives:

$\frac{1}{5!} (p_2^2 + p_2^2)$

This simplifies to:

$\frac{1}{120} (p_2^2 + p_2^2) = \frac{1}{60} (2p_2^2) = \frac{1}{30} p_2^2$

The coefficient of $p_2^2$ in this expression is $\frac{1}{30}$ , so the number of permutations in S_5 with cycle type [2,2] is $\frac{1}{30}$ .

Since <p> is the only permutation in S_5 with cycle type [2,2], it is the $\frac{1}{30}$-th permutation in S_5.

Therefore, the index of <p> in S_5 is $\boxed{\frac{1}{30}}$.

Remark: Woah! This is very wrong! The very first sentence is completely wrong. Also, its final conclusion, $1/30$, is not even an integer, which is certainly not right. Indeed, the index of a subgroup counts something and must be a positive integer.

We are able to salvage something though. Suppose we computed that the order of $\langle p \rangle$ is 5, using our algebra skills. Then we can prompt chat-GPT with the following.

Me: Now suppose $H \leq S_5$ is a subgroup of order 5. What is the index of $H$ in $S_5$ ?

Chat-GPT: The index of a subgroup $H$ in a group $G$ is the number of left cosets of $H$ in $G$ . This is also equal to the number of right cosets of $H$ in $G$ .

Since the order of a subgroup divides the order of the group, the order of $H$ must divide the order of $S_5$ , which is 120. The only divisors of 120 that are less than 120 are 1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, and 120. Therefore, the only possible orders for $H$ are 1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, and 120.

Since the order of $H$ is 5, the index of $H$ in $S_5$ is $\frac{120}{5} = \boxed{24}$ .

Remark: Here chat-GPT reaches the right answer with a good explanation, although its list of divisors of 120 is sloppy (it omits 60 and includes 120 itself). Perhaps it would be nice to mention that the main fact being used is Lagrange’s theorem, though this can easily be obtained by asking it “What theorem did you use above?”
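
The “algebra skills” step, computing the order of $p$, is also easy to check by machine. The following sketch represents permutations as dictionaries, multiplies the two cycles, and divides $|S_5| = 120$ by the order of $p$. The helper names (`cycle_to_map`, `compose`, `order`) are made up for this example, and the product is read right to left (apply $(2, 3)$ first), which is one common convention.

```python
from math import factorial

def cycle_to_map(cycle, n):
    """Return the permutation of {1..n} given by a single cycle, as a dict."""
    mapping = {k: k for k in range(1, n + 1)}
    for pos, k in enumerate(cycle):
        mapping[k] = cycle[(pos + 1) % len(cycle)]
    return mapping

def compose(f, g):
    """Return the permutation 'apply g first, then f'."""
    return {k: f[g[k]] for k in g}

def order(perm):
    """Smallest m >= 1 such that perm^m is the identity."""
    identity = {k: k for k in perm}
    power, m = dict(perm), 1
    while power != identity:
        power = compose(perm, power)
        m += 1
    return m

n = 5
p = compose(cycle_to_map((1, 2, 5, 4), n), cycle_to_map((2, 3), n))
index = factorial(n) // order(p)  # Lagrange: index = |S_5| / |<p>|
print(order(p), index)
```

With either multiplication convention $p$ turns out to be a 5-cycle, so the order and the index do not depend on that choice.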

But what’s the intuition?

What makes a research mathematician great is a certain “intuition” for problems. Let’s see if chat-GPT can explain any of that.

Me: Can you explain the intuition behind the Cauchy-Schwarz inequality? Please write any math formula in latex.

Chat-GPT: Sure! The Cauchy-Schwarz inequality is a useful inequality that arises in various areas of mathematics, such as vector algebra, inner product spaces, and probability theory. It states that for any two vectors x and y, the following inequality holds:

$\left(\sum_{i=1}^n x_i y_i\right)^2 \le \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right)$

Intuitively, the Cauchy-Schwarz inequality states that the inner product of two vectors x and y is “controlled” by the lengths of the vectors x and y. To see this, consider the special case where x and y are unit vectors (i.e., vectors of length 1). In this case, the inequality becomes:

$\left(\sum_{i=1}^n x_i y_i\right)^2 \le \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right) = 1 \cdot 1 = 1$

which means that the inner product of x and y is always less than or equal to 1. This is consistent with our intuition that the inner product of two unit vectors should be “small” if they are “orthogonal” (i.e., perpendicular) and “large” if they are “parallel” (i.e., pointing in the same direction).

I hope this helps! Let me know if you have any questions.

Remark: This is not too bad, but could certainly be improved. For instance, explaining when equality holds in Cauchy-Schwarz (when one vector is a multiple of the other) helps give a more precise meaning to the last paragraph. See the first few paragraphs of this blog post of Terence Tao for an explanation from one of the world’s leading mathematicians. Also, here is a Math Stack Exchange thread on the topic.
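
The equality case is easy to see numerically. Below is a small sketch with integer vectors, so the arithmetic is exact; the function name `cauchy_schwarz_gap` and the sample vectors are arbitrary choices for illustration.

```python
def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def cauchy_schwarz_gap(x, y):
    """Return (sum x_i^2)(sum y_i^2) - (sum x_i y_i)^2, which is never negative."""
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2

x = [1, 2, 3]
generic = [4, -5, 6]              # not a multiple of x
parallel = [3 * a for a in x]     # a multiple of x

print(cauchy_schwarz_gap(x, generic))   # strictly positive
print(cauchy_schwarz_gap(x, parallel))  # exactly zero: the equality case
```

The gap vanishes precisely for the parallel pair, which is the sense in which the inner product is “large” for vectors pointing in the same direction.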

Prove it.

Me: Can you prove it?

Chat-GPT: Sure! Here is a proof of the Cauchy-Schwarz inequality:

First, we can rewrite the inequality as:

$\left(\sum_{i=1}^n x_i y_i\right)^2 - \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right) \le 0$

Next, we can expand the left-hand side as follows:

$\sum_{i=1}^n x_i^2 y_i^2 + \sum_{i \ne j} x_i x_j y_i y_j - \sum_{i=1}^n x_i^2 y_i^2 - \sum_{i \ne j} x_i^2 y_j^2 \le 0$

This simplifies to:

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \sum_{i \ne j} (x_i y_j)^2 \le 0$

Now, we can apply the arithmetic-geometric mean inequality to each term on the right-hand side:

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \sum_{i \ne j} [(x_i y_j)^2]^{\frac{1}{2}}[(x_i y_j)^2]^{\frac{1}{2}} \le 0$

This simplifies to:

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \left(\sum_{i \ne j} [(x_i y_j)^2]^{\frac{1}{2}}\right)^2 \le 0$

Finally, we can apply the arithmetic-geometric mean inequality again to the term on the right-hand side:

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \left[\prod_{i \ne j} [(x_i y_j)^2]^{\frac{1}{2}}\right]^{\frac{2}{n-1}} \le 0$

This simplifies to:

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \prod_{i \ne j} (x_i y_j) \le 0$

And we are done! This completes the proof of the Cauchy-Schwarz inequality.

I hope this helps! Let me know if you have any questions.

Remark: Unfortunately, this proof is not correct. Both the proof method and the details (e.g. its application of AM-GM) are severely flawed.
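
For comparison, here is one standard way to finish the argument correctly, starting from the expansion Chat-GPT reached just before it invoked AM-GM (that expansion is valid). This is a sketch of the classical Lagrange identity route, not a repair of Chat-GPT’s own steps. Swapping the roles of $i$ and $j$ and averaging gives

$\sum_{i \ne j} (x_i y_i)(x_j y_j) - \sum_{i \ne j} (x_i y_j)^2 = -\frac{1}{2} \sum_{i \ne j} (x_i y_j - x_j y_i)^2 \le 0$

since each square expands to $x_i^2 y_j^2 + x_j^2 y_i^2 - 2 (x_i y_i)(x_j y_j)$, and the first two terms contribute equally to the sum over $i \ne j$. Equality holds exactly when $x_i y_j = x_j y_i$ for all $i, j$, that is, when one vector is a multiple of the other.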

Conclusion

To be honest, Chat-GPT is quite impressive for a model that is not tailored for mathematics in any way. I think it could already, in its current format, help a mathematician prepare lecture notes or even the easier parts of a paper. Perhaps its greatest utility would come in automating certain parts of grant and job applications.


George Shakan

Data Science and Math Blog

Category Archives: Mathematics

Carnival Of Mathematics

Can Chat-GPT Do Math?