We experimented a bit more with
Open AI.
1. Basic algebra. Especially PEMDAS
This is a famous problem (see this page):
What is 2x/3y-1 if x=9 and y=2?
Open AI Answer:
2x/3y-1 = (2*9)/(3*2)-1 = 6/6-1 = 0
Grading:
This is a big surprise because the AI made only one inexplicable arithmetic error (2*9 = 6)
and had no other issues. It was almost there and should have obtained
2x/3y-1 = (2*9)/(3*2)-1 = 18/6-1 = 2
which is the answer most humans get. If we ask
"Compute (2*9)/(3*2)-1", the AI does indeed get 2.
This problem came up because almost all
computers and programming languages give 11: they evaluate 2x/3y from left to right as ((2x)/3)y. Here in Mathematica:
x = 9; y = 2; 2x/3y - 1
which gives the answer 11.
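The two readings of the expression can be compared directly. The following is a minimal Python sketch (Python is our choice for illustration, the names `left_to_right` and `grouped` are ours); it only shows how the placement of parentheses changes the value.

```python
x, y = 9, 2

# Multiplication and division have equal precedence and associate from left
# to right, so this is ((2*x)/3)*y - 1, the reading a programming language uses.
left_to_right = 2 * x / 3 * y - 1

# The reading most humans choose treats 3y as a single denominator.
grouped = (2 * x) / (3 * y) - 1

print(left_to_right)  # 11.0
print(grouped)        # 2.0
```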
2. Basic calculus
This is still disappointing. We would expect that the AI would
just feed it into a computer algebra system:
What is the anti derivative of exp(-x^2)
Open AI Answer:
The anti derivative of exp(-x^2) is -1/2 exp(-x^2) + C, where C is
an arbitrary constant.
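This is completely off. The derivative of -1/2 exp(-x^2) is x exp(-x^2), not exp(-x^2).
The anti derivative of exp(-x^2) is not an elementary function; it is sqrt(pi)/2 erf(x) + C.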
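A quick numerical check makes the failure visible. The sketch below (Python, standard library only; the helper names are ours) differentiates both the AI's candidate and sqrt(pi)/2 erf(x) with a central difference and compares the result with exp(-x^2). It is a sanity check, not a proof.

```python
import math

def central_difference(F, x, h=1e-5):
    """Approximate F'(x) with a symmetric difference quotient."""
    return (F(x + h) - F(x - h)) / (2 * h)

def target(x):
    return math.exp(-x * x)

def ai_answer(x):
    # The candidate given by the AI: -1/2 exp(-x^2)
    return -0.5 * math.exp(-x * x)

def erf_answer(x):
    # The known antiderivative: sqrt(pi)/2 erf(x)
    return math.sqrt(math.pi) / 2 * math.erf(x)

for x in (0.0, 0.5, 2.0):
    print(x, target(x),
          central_difference(ai_answer, x),
          central_difference(erf_answer, x))
```

Only the last column should agree with exp(-x^2) at every sample point.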
3. Multivariable calculus exam problems
Practice Exam 1 Problem 2, Math 21a, Fall 2022
Find the global maximum and minimum values of the function
f(x,y,z) = x z + y^2+ 2 y on the region x^2+y^2+z^2 <=4.
Open AI Answer: Global max 8, global min 0.
Grading:
Global maximum 8: true.
Global minimum 0: false. The minimum is -8/3.
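The values can be checked by evaluating f at the candidate points. The candidates in the sketch below are our own derivation, not part of the exam solution: the interior critical point (0,-1,0), and the boundary points obtained from the Lagrange equations z = 2Lx, 2y+2 = 2Ly, x = 2Lz on the sphere of radius 2. Exact fractions avoid rounding.

```python
from fractions import Fraction as Fr

def f(x, y, z):
    return x * z + y**2 + 2 * y

# Candidate points (our derivation from the Lagrange equations).
candidates = {
    "interior critical point": (Fr(0), Fr(-1), Fr(0)),
    "boundary, x = z = 0, y = 2": (Fr(0), Fr(2), Fr(0)),
    "boundary, x = z = 0, y = -2": (Fr(0), Fr(-2), Fr(0)),
    "boundary, z = -x, y = -2/3": (Fr(4, 3), Fr(-2, 3), Fr(-4, 3)),
}

for label, (x, y, z) in candidates.items():
    inside = x * x + y * y + z * z <= 4  # point lies in the region
    print(label, f(x, y, z), inside)

values = [f(*p) for p in candidates.values()]
print(max(values), min(values))  # 8 -8/3
```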
Here is a generic word problem. The formulation helps a bit.
Practice Exam 1 Problem 6, Math 21a, Fall 2022
The post office will only ship cuboid boxes of dimension x,y,z for which the height plus girth z+2x+2y does not exceed 108. What dimensions will result in the
greatest volume x*y*z?
Open AI Answer:
The greatest volume will be achieved when x, y, and z are all equal.
In this case, the maximum dimensions would be x = y = z = 36,
resulting in a volume of 46,656.
Grading:
The answer is wrong: x = y = z = 36 gives z+2x+2y = 180, which violates the constraint.
This is a Lagrange problem. The best is (x,y,z) = (18,18,36) with volume 11664.
f=x*y*z; g=z+2x+2y; c=108;
Solve[{D[f,x]==L*D[g,x],
D[f,y]==L*D[g,y],
D[f,z]==L*D[g,z], g==c},{x,y,z,L}]
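As a cross-check that does not rely on the Lagrange equations, one can search by brute force. The sketch below is in Python and makes two simplifying assumptions of ours: the dimensions are restricted to positive integers, and the constraint is used with equality z + 2x + 2y = 108, since a box with slack can always be enlarged.

```python
LIMIT = 108

best_volume, best_dims = 0, None
for x in range(1, LIMIT // 2):
    for y in range(1, LIMIT // 2):
        z = LIMIT - 2 * x - 2 * y  # use the full allowance for z
        if z <= 0:
            continue
        volume = x * y * z
        if volume > best_volume:
            best_volume, best_dims = volume, (x, y, z)

print(best_dims, best_volume)  # (18, 18, 36) 11664
```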
Here is a true/false problem:
True False Practice Final Exam 3 problem, problem 8a
For every scalar function f there is a vector field F such that div(F)=f.
Open AI Answer:
Yes, this is true. The vector field F is called the gradient of f and is
defined as F = ∇f. The divergence of the gradient of a scalar function
is equal to the scalar function itself, so div(F) = div(∇f) = f.
Grading:
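Correct answer but completely wrong explanation. div(grad f) is the Laplacian of f, which
in general is not equal to f. For example, f = x has grad f = [1,0,0] and div(grad f) = 0.
The trick is to restrict F to F=[P,0,0] and solve P_x = f, for example by integrating a
continuous f with respect to x.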
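The construction F=[P,0,0] can be tried on a concrete function. In the Python sketch below the sample function f = xy + z and the evaluation point are hypothetical choices of ours; P is obtained by integrating f from 0 to x. Derivatives are approximated by finite differences, so this is a numerical illustration only.

```python
def f(x, y, z):
    return x * y + z  # hypothetical sample scalar function

def P(x, y, z):
    # integral of f(t, y, z) dt from t = 0 to t = x
    return x * x * y / 2 + x * z

def d_dx(g, x, y, z, h=1e-5):
    """Central difference approximation of the x-derivative of g."""
    return (g(x + h, y, z) - g(x - h, y, z)) / (2 * h)

def laplacian(g, x, y, z, h=1e-3):
    """Second-difference approximation of div(grad g)."""
    return (g(x + h, y, z) + g(x - h, y, z)
            + g(x, y + h, z) + g(x, y - h, z)
            + g(x, y, z + h) + g(x, y, z - h)
            - 6 * g(x, y, z)) / h**2

point = (1.5, -0.5, 2.0)
div_F = d_dx(P, *point)          # div of F = [P, 0, 0] is just P_x
print(f(*point), div_F)          # these agree
print(laplacian(f, *point))      # div(grad f), which differs from f here
```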
And another one:
True False Practice Final Exam 3 problem, problem 8b
If f(x,y) is a function with no critical points, then f cannot attain a global
maximum on x^2+y^2 <= 1.
Open AI Answer:
This is not necessarily true. A function with no critical points can
still have a global maximum on x^2+y^2 <= 1. For example, the function
f(x,y) = x^2 + y^2 has no critical points, but it has a global maximum of
1 on x^2+y^2 <= 1.
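Grading: The verdict is correct, but the example is flawed: f(x,y) = x^2 + y^2 has a
critical point at (0,0). A valid example is f(x,y) = x, which has no critical points and
attains its global maximum 1 at the boundary point (1,0). A better explanation would be to
cite the extreme value theorem, which assures that a continuous function on a closed and
bounded domain attains a maximum and minimum.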
4. Putnam 2022 exam problems
We only looked at some problems where we suspect success is possible:
Putnam Exam problem A1 2022
Determine all ordered pairs of real numbers (a, b) such that the line y = ax+b
intersects the curve y = ln(1+ x^2 ) in exactly one point.
Open AI Answer: (1, -1) and (-1, 1).
Grading: The answer is partially correct. Both lines work, but there are more solutions,
for example (0,0). Here is a plot of the two lines and the curve:
Plot[ {1 - x, x - 1, Log[1 + x^2]}, {x, -5, 5}]
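Besides plotting, one can count intersections numerically. The Python sketch below counts sign changes of g(x) = ax + b - ln(1+x^2) on a fine grid over [-100, 100]; the grid, the helper name and the contrasting line (a, b) = (0.5, 0.1) are our assumptions. Sign changes miss tangential intersections such as the one for (0,0), so this is a rough check and not a proof.

```python
import math

def sign_changes(a, b, lo=-100.0, hi=100.0, steps=200000):
    """Count sign changes of a*x + b - ln(1 + x^2) on a uniform grid."""
    count = 0
    previous = None
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        g = a * x + b - math.log(1 + x * x)
        if g == 0.0:
            continue  # skip exact zeros; neighbors decide the sign change
        if previous is not None and previous * g < 0:
            count += 1
        previous = g
    return count

for a, b in [(1, -1), (-1, 1), (0.5, 0.1)]:
    print((a, b), sign_changes(a, b))
```

The two lines from the AI's answer give one crossing each, while the flatter line crosses the curve three times.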
Another Putnam problem:
Putnam Exam Problem B6 2022
Find all continuous functions f from the positive real numbers to the
positive real numbers such that
f(x f(y))+ f(y f(x)) = 1+ f(x+y) for all positive x and y.
Open AI Answer:
The only continuous function that satisfies this equation is f(x) = 1.
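Grading: The answer is partially correct. There are more examples: every function
f(x) = 1/(1+c x) with c >= 0 satisfies the equation, and f(x) = 1 is the case c = 0.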
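That functions of the form f(x) = 1/(1+c x) satisfy the functional equation can be spot-checked with exact rational arithmetic. In the Python sketch below the sample values of c, x and y are arbitrary choices of ours. A check at finitely many points is of course not a proof; the identity follows from a short computation.

```python
from fractions import Fraction as Fr
from itertools import product

def make_f(c):
    """Return the function x -> 1/(1 + c*x) for a fixed c >= 0."""
    return lambda x: 1 / (1 + c * x)

def satisfies(f, x, y):
    return f(x * f(y)) + f(y * f(x)) == 1 + f(x + y)

samples = [Fr(1, 3), Fr(1), Fr(5, 2), Fr(7)]
results = {}
for c in (Fr(0), Fr(1), Fr(3, 4)):
    f = make_f(c)
    results[c] = all(satisfies(f, x, y) for x, y in product(samples, repeat=2))
    print(c, results[c])
```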
Here is a tougher problem:
Putnam Exam Problem A5 2022
Alice and Bob play a game on a board consisting of one row of 2022
consecutive squares. They take turns placing tiles that cover two
adjacent squares, with Alice going first. By rule, a tile must not cover
a square that is already covered by another tile. The game ends when no
tile can be placed according to this rule. Alice's goal is to maximize
the number of uncovered squares when the game ends; Bob's goal is
to minimize it. What is the greatest number of uncovered squares
that Alice can ensure at the end of the game, no matter how Bob plays?
Open AI Answer:
The greatest number of uncovered squares that Alice can ensure at
the end of the game is 1011.
Grading:
The answer is false. Bob can place many tiles that are not adjacent to each other.
The correct answer is 290. There is a recursion for the optimal final
score a(n) of Alice and the optimal final score b(n) of Bob.
We have a(n) = f(n)+a(n-f(n)), b(n) = f(n)+b(n-f(n)), with f(n)=floor(n/7)
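Independently of the recursion, the game can be explored for small boards by exhaustive minimax search. The Python sketch below is our own illustration: a position is stored as the sorted tuple of lengths of the maximal runs of uncovered squares, and the function names are hypothetical. The state space grows quickly, so this is only practical for small n and says nothing directly about n = 2022.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def game_value(runs, alice_to_move):
    """Number of uncovered squares at the end under optimal play.

    runs: sorted tuple of lengths of maximal runs of uncovered squares.
    """
    moves = set()
    for idx, length in enumerate(runs):
        if length < 2:
            continue  # no tile fits into a run of length 0 or 1
        rest = runs[:idx] + runs[idx + 1:]
        for left in range(length - 1):
            right = length - 2 - left  # the tile splits the run in two
            new_runs = rest + (left, right)
            moves.add(tuple(sorted(r for r in new_runs if r > 0)))
    if not moves:
        return sum(runs)  # game over: all remaining squares stay uncovered
    values = [game_value(m, not alice_to_move) for m in moves]
    # Alice maximizes the number of uncovered squares, Bob minimizes it.
    return max(values) if alice_to_move else min(values)

def alice_best(n):
    """Optimal result on a row of n squares with Alice moving first."""
    return game_value((n,), True)

for n in range(2, 15):
    print(n, alice_best(n))
```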
5. Problem solving examples
Problem in Polya's How to Solve It on page 38:
Find the center of gravity of a homogeneous tetrahedron.
Open AI answer:
The center of gravity of a homogenous tetrahedron is located
at the intersection of the three medians, which is also the
centroid of the tetrahedron. The centroid is the point where
the four vertices of the tetrahedron meet and is located at
the center of the tetrahedron.
Grading:
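The answer (the centroid) is correct, although the description is sloppy: a tetrahedron
has four medians, and the centroid is the average of the four vertices, not a point
where the vertices meet.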
Here are some tougher cookies:
Problem 2.5 in Tao's book Solving mathematical problems
Prove that for any non-negative integer n the number
1^n+2^n+3^n+4^n is divisible by 5 if and only if n is
not divisible by 4
Open AI Answer:
First, assume that n is not divisible by 4.
We can write 1^n+2^n+3^n+4^n as (1+2+3+4)^n.
Since n is not divisible by 4, (1+2+3+4)^n is divisible by 5.
Grading: The proof is false. It is not true that
1^n+2^n+3^n+4^n=(1+2+3+4)^n if n is not divisible by 4.
For example, for n = 3 we have 1^n + 2^n + 3^n + 4^n = 100, while (1+2+3+4)^n = 1000.
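The statement itself is easy to test by computer for small exponents. The Python sketch below (the function name is ours) tabulates the sum and checks both divisibility conditions side by side. The pattern repeats with period 4 because k^4 leaves remainder 1 modulo 5 for k = 1, 2, 3, 4.

```python
def power_sum(n):
    return 1**n + 2**n + 3**n + 4**n

for n in range(13):
    s = power_sum(n)
    # third column: divisible by 5?  fourth column: n not divisible by 4?
    print(n, s, s % 5 == 0, n % 4 != 0)

print(power_sum(3), (1 + 2 + 3 + 4)**3)  # 100 1000
```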
And another one from that book:
Problem 2.2 in Tao's book Solving mathematical problems
Show that there is no power of 2 such that its decimal
digits could be rearranged and made into another power of 2.
No zeroes are allowed in leading digits.
Open AI Answer:
We will prove this by contradiction. Suppose there is a power
of 2 such that its decimal digits can be rearranged and made
into another power of 2. Let the power of 2 be 2^n and the
rearranged power of 2 be 2^m.
Since the rearranged power of 2 is also a power of 2, we know
that m = log_2(2^m) = log_2(2^n) = n. This means that the
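Grading: The proof is not only incomplete, the approach also can not work.
We need an invariant like the digit sum modulo 9, which does not change when digits are
rearranged. For powers of 2 it is periodic with period 6 (1, 2, 4, 8, 7, 5) with no
repetition within a period. So two powers of 2 with the same digits would have exponents
differing by a multiple of 6 and would differ by a factor of at least 2^6 = 64. Since this
is larger than 10, the number of digits can not be the same.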
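The invariant can be inspected directly. The Python sketch below (helper names are ours) lists 2^n modulo 9, confirms for small n that the digit sum has the same remainder modulo 9 as the number itself, and checks by brute force that no two powers of 2 below 2^200 share the same multiset of digits. The brute-force range is an arbitrary choice and only illustrates the statement.

```python
def digit_sum(k):
    return sum(int(d) for d in str(k))

# Remainders of 2^n modulo 9 repeat with period 6.
residues = [pow(2, n, 9) for n in range(12)]
print(residues)  # [1, 2, 4, 8, 7, 5, 1, 2, 4, 8, 7, 5]

# The digit sum has the same remainder modulo 9 as the number itself.
invariant_ok = all(digit_sum(2**n) % 9 == pow(2, n, 9) for n in range(200))
print(invariant_ok)

# Brute force: the sorted digit strings of 2^0, ..., 2^199 are all distinct.
signatures = {"".join(sorted(str(2**n))) for n in range(200)}
print(len(signatures) == 200)
```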