Idempotent elements in Rings

For a commutative ring R, an idempotent is an element e \in R such that e^2 = e . The collection Re is an ideal of R and, moreover, is a ring in its own right with identity e (why?). A good example of an idempotent to have in mind is the element (1,0) \in R \times R', where R' is another commutative ring, and in some sense these are the only idempotents.

For an idempotent e \in R, the element e' = 1 - e \in R is also an idempotent and ee' = 0 . These should remind you of the elements e = (1,0), e' = (0,1) \in R\times R'. We call a pair \{e, e'\} with e, e' \in R, e' = 1 - e and ee' = 0 a pair of complementary idempotents.

Complementary idempotents give a formulation of an internal product decomposition. What do I mean by this? If \{e,e'\} is a pair of complementary idempotents in R, then defining R' = Re, R'' = Re' (remember these are rings in their own right), the map \phi \colon R \to R' \times R'' given by r \mapsto (re,re') is a ring isomorphism. This is an instructive check to make and I leave it to the reader.

What do idempotents of some rings look like? 1 is always idempotent, making \{1,0\} a pair of complementary idempotents. In any field, the only non-zero idempotent element is the multiplicative unit 1 .

What about the rings \mathbb{Z}/n\mathbb{Z} ? If n = p^k for some prime number p , suppose a + p^k\mathbb{Z} is an idempotent. Then p^k | a(1-a) , and p cannot divide both a and 1-a , for otherwise p | a + (1-a) = 1 , which is a contradiction; so p^k | a or p^k | 1-a . Hence the only idempotents of \mathbb{Z}/p^k\mathbb{Z} are 0 and 1, and the only pair of complementary idempotents is \{1,0\}.

We can now look at the general case. Let n = {p_1}^{r_1} \cdots {p_m}^{r_m} be n's prime factorisation; the Chinese Remainder Theorem tells us \mathbb{Z}/n\mathbb{Z} \cong \mathbb{Z}/{p_1}^{r_1}\mathbb{Z} \times \cdots \times \mathbb{Z}/{p_m}^{r_m}\mathbb{Z} , so there are 2^m idempotents and 2^{m-1} pairs of complementary idempotents.
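If you want to see this count concretely, here is a minimal Python sketch (the helper names are my own) that brute-forces the idempotents of \mathbb{Z}/n\mathbb{Z} and checks there are 2^m of them, where m is the number of distinct primes dividing n.

```python
def idempotents(n):
    """Return all e in Z/nZ with e^2 = e (mod n)."""
    return [e for e in range(n) if (e * e - e) % n == 0]

def num_distinct_prime_factors(n):
    """Count the distinct primes dividing n by trial division."""
    count, d = 0, 2
    while d * d <= n:
        if n % d == 0:
            count += 1
            while n % d == 0:
                n //= d
        d += 1
    return count + (1 if n > 1 else 0)

for n in [7, 8, 12, 30, 360]:
    e = idempotents(n)
    m = num_distinct_prime_factors(n)
    print(n, e, len(e) == 2 ** m)
# e.g. n = 12 gives [0, 1, 4, 9]: the complementary pairs are {1, 0} and {4, 9}.
```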

Lagrange Interpolation, a unique polynomial

Let z_1, \ldots, z_k be k distinct complex numbers and let w_1, \ldots, w_k be k complex numbers which need not be distinct. A natural question to ask is, can we find a polynomial P(z) such that P(z_m) = w_m for m = 1,\ldots, k ?

We can, and in fact we can say more: there is a unique polynomial of degree \leq k-1 satisfying the above! This polynomial is called the Lagrange interpolation polynomial and is constructed explicitly.

We start by setting A(z) = (z - z_1)\cdots(z-z_k) and let A_m(z) = \dfrac{A(z)}{z-z_m} .

What have we just constructed? A_m(z) is a polynomial of degree k-1 with A_m(z_m) \neq 0 and A_m(z_i) = 0 if i \neq m.

So \dfrac{A_m(z)}{A_m(z_m)} is another polynomial of degree k-1 which vanishes at z_j for j \neq m , and takes the value 1 at z_m. This polynomial picks out an individual point and ignores all the rest, which is precisely what we need.

Let us define P(z) = \sum_{m=1}^{k} w_m \dfrac{A_m(z)}{A_m(z_m)} which is a polynomial of degree \leq k-1 with the properties we want.

What about uniqueness? Suppose we have another such polynomial Q(z). The difference P(z) - Q(z) vanishes at k distinct points but has degree \leq k-1 , and a non-zero polynomial of degree \leq k-1 has at most k-1 roots, so the difference must be 0 and hence P(z) = Q(z).
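For the computationally inclined, here is a minimal Python sketch of the construction (function name my own): it builds each A_m(z)/A_m(z_m) on the fly and sums them with the weights w_m; it works just as well with complex inputs.

```python
def lagrange_interpolate(zs, ws):
    """Return the Lagrange interpolation polynomial as a callable.

    zs: distinct nodes z_1, ..., z_k; ws: target values w_1, ..., w_k.
    """
    def P(z):
        total = 0
        for m, (zm, wm) in enumerate(zip(zs, ws)):
            # A_m(z) / A_m(z_m): degree k-1, equals 1 at z_m and 0 at the other nodes
            term = wm
            for i, zi in enumerate(zs):
                if i != m:
                    term *= (z - zi) / (zm - zi)
            total += term
        return total
    return P

P = lagrange_interpolate([1, 2, 3], [2, 3, 5])
print([P(z) for z in (1, 2, 3)])   # [2.0, 3.0, 5.0]
print(P(4))                        # the unique degree <= 2 interpolant evaluated at 4
```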

 

Algebras and why modules are a thing

A k-algebra A, where k is a field, is a ring with the added structure of being a vector space over k (with respect to the same addition) where scalar multiplication behaves nicely with the ring multiplication: \mu(ab) = (\mu a)b = a(\mu b) for \mu \in k and a, b \in A. The archetypal example is the algebra of n \times n matrices with entries in k, denoted M_{n}(k). There is a notion of algebra morphism which is not too hard to figure out with the above or a quick google, so we have a notion of isomorphism.

Matrix algebras are very special: they have vectors on which they can act. For M_{n}(k) these objects are the tuples (x_1, \cdots, x_n) where x_i \in k. As a collection, they are denoted k^n. These form a vector space and as algebras M_{n}(k) \cong \text{End}(k^n) , where the algebra operation in M_n(k) is matrix multiplication and \text{End}(k^n) is the algebra of linear maps k^n \longrightarrow k^n with the algebra operation being composition.
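To make this correspondence concrete, here is a small self-contained Python sketch (my own helper names, plain lists rather than a matrix library) checking that composing the linear maps matches multiplying the matrices.

```python
def mat_vec(A, x):
    """Apply the matrix A (list of rows) to the tuple x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def mat_mul(A, B):
    """Multiply matrices A and B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [0, 1]]
B = [[3, 0], [1, 1]]
x = (4, 5)

# Acting by B then A is the same as acting by the product A*B:
print(mat_vec(A, mat_vec(B, x)) == mat_vec(mat_mul(A, B), x))  # True
```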

The definition of an A-module can be found elsewhere, but to motivate its origins: it is precisely the articulation of assigning to a general k-algebra A some 'vectors' in a vector space V on which A can act, just as matrices act on tuples. If you struggle to remember the definition of an A-module, just think like this and you should be able to reconstruct it.

Modules can even be defined for a ring R, relaxing the vector space structure to that of an abelian group. And so we can assign 'vectors' to rings by looking at R-modules. So we can ask the question: what are \mathbb{Z}-modules? These are in fact abelian groups (why?) and so the collection of possible 'vectors' for \mathbb{Z} can be thought of as just abelian groups.

I think that’s pretty nice.

It’s all local for topological vector spaces

For a vector space V and a topology \tau on the space, this pair is called a topological vector space when \{v\} is closed for all v \in V and both vector space operations, addition and scalar multiplication, are continuous.

Defining maps T_\alpha : V \longrightarrow V by T_\alpha(v) = \alpha + v where \alpha \in V, it's not hard to show that T_\alpha is in fact a homeomorphism. This gives us that the topology is completely determined locally, around some point, say 0 \in V, because E \subset V is open iff \alpha + E is open for all \alpha \in V.

I think this is pretty neat and possibly the archetypal situation where local data determines everything globally, illustrating a guiding philosophy in modern mathematics.

Cauchy’s inequality and the appeal to symmetry

Working over the reals \mathbb{R}, a young mathematician will learn the inequality (a_1b_1 + \cdots +  a_nb_n)^2 \leq (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2) which has its name associated to Cauchy and Schwarz and is called the Cauchy–Schwarz inequality.

After proving this result the question of when this inequality is an equality is brought up, and the young mathematician will learn this occurs when a_i = \lambda b_i for some \lambda \in \mathbb{R} and for all i \in \{1,\cdots,n \} . Here is a neat naive proof of this result following the mathematician's philosophy of 'look for symmetry'.

Let us look at the ‘error’ E_n = (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2) - (a_1b_1 + \cdots + a_nb_n)^2.

Expanding out gives us E_n = \sum_{i,j}a_i^2b_j^2 - \sum_{i,j}a_ib_ia_jb_j

Now \sum_{i,j}a_i^2b_j^2 = \frac{1}{2}\sum_{i,j}(a_i^2b_j^2 + a_j^2b_i^2). Yes, I did just write this, and no, I'm not insulting your intelligence. This is my appeal to symmetry.

We get E_n = \frac{1}{2}\sum_{i,j}(a_i^2b_j^2 + a_j^2b_i^2) - \sum_{i,j}a_ib_ia_jb_j = \frac{1}{2}\sum_{i,j}(a_i^2b_j^2 + a_j^2b_i^2 - 2a_ib_ia_jb_j )

And so E_n = \frac{1}{2}\sum_{i,j}(a_ib_j - a_jb_i)^2. I'll let you finish the rest off but we've done all the hard work here.
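If you want a quick sanity check of this identity, here is a small Python sketch (my own code) comparing E_n with \frac{1}{2}\sum_{i,j}(a_ib_j - a_jb_i)^2 for random real vectors.

```python
import random

def error(a, b):
    """E_n = (sum a_i^2)(sum b_i^2) - (sum a_i b_i)^2."""
    return sum(x*x for x in a) * sum(y*y for y in b) - sum(x*y for x, y in zip(a, b)) ** 2

def symmetric_form(a, b):
    """(1/2) * sum over i, j of (a_i b_j - a_j b_i)^2."""
    return 0.5 * sum((ai*bj - aj*bi) ** 2
                     for ai, bi in zip(a, b) for aj, bj in zip(a, b))

a = [random.uniform(-1, 1) for _ in range(5)]
b = [random.uniform(-1, 1) for _ in range(5)]
print(abs(error(a, b) - symmetric_form(a, b)) < 1e-12)  # True (up to rounding)
```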

A theorem by Cauchy on ‘thinning’ sequences

Just to clarify, all the numbers in this post are going to live in \mathbb{R} . Given a sequence \{a_n\} where n \in \mathbb{N} , we have a collection of partial sums \{s_n\} indexed by \mathbb{N} and defined by s_n = a_1 + a_2 + \cdots + a_n . If the sequence \{s_n\} converges to s (in the usual \epsilon–N way) we say the series converges and write \sum_{n = 1}^{\infty} a_n = s. For completeness, if the sequence \{s_n\} diverges, the series is said to diverge.

If you know sequences really well, you know series really well as every theorem about sequences can be stated in terms of series (putting a_1 = s_1 and a_n = s_n - s_{n-1} for n > 1 ). In particular the monotone convergence theorem has an instantaneous counterpart for series.

Theorem: A series of non-negative real terms converges if and only if the partial sums form a bounded sequence.

I'm going to omit the proof here but it is a quick application of the monotone convergence theorem to the partial sums. So why bring this up? Well, if we impose that the terms in our series are monotonically decreasing (which can appear in applications) we can apply the following theorem of Cauchy. What is interesting about this theorem is that a 'thin' subsequence of \{a_n\} determines the convergence or divergence of the series.

Theorem: Suppose a_1 \geq a_2 \geq \cdots \geq 0 are real numbers. Then the series \sum_{n=1}^{\infty} a_n converges if and only if the series \sum_{k=0}^{\infty}2^k a_{2^k} converges.

Proof: By the previous theorem it suffices to consider only the boundedness of the partial sums. Let us write s_n = a_1 + \cdots + a_n and t_k = a_1 + 2a_2 + \cdots + 2^ka_{2^k} .  We will look at two cases, when n < 2^k and when n > 2^k .

For n < 2^k we have s_n \leq a_1 + (a_2 + a_3) + \cdots + (a_{2^k} + \cdots +a_{2^{k+1} - 1}) \leq a_1 + 2a_2 + \cdots + 2^ka_{2^k} = t_k , where the first inequality followed from n < 2^k and the second inequality from the hypothesis.

When n > 2^k we have s_n \geq a_1 + a_2 + (a_3 + a_4) + \cdots +(a_{2^{k-1} +1} + \cdots + a_{2^k}) \geq \frac{1}{2}a_1 + a_2 +2a_4 + \cdots + 2^{k-1}a_{2^k} = \frac{1}{2}t_k , where the first inequality follows from n > 2^k and the second (you guessed it) follows from our hypothesis.

Bringing these together we conclude that the sequences \{s_n\} and \{t_k\} are either BOTH bounded or BOTH unbounded, which completes the proof.

When I came across this I thought it was pretty astounding (hence why it has made it onto the blog), so let's see it in action. We will use it to deduce for p \in \mathbb{Z} that \sum_{n=2}^{\infty} \frac{1}{n(\log n)^p} converges if p >1 and diverges if p \leq 1 .

The monotonicity of the logarithmic function implies that \{\log n \} increases, so the terms \frac{1}{n(\log n)^p} decrease (at least eventually), which puts us in a good position to apply our theorem. This leads us to the following, which is enough as a proof.

\sum_{k=1}^{\infty}2^k\frac{1}{2^k(\log 2^k)^p} = \sum_{k=1}^{\infty} \frac{1}{k^p(\log 2)^p} =\frac{1}{(\log 2)^p} \sum_{k=1}^{\infty} \frac{1}{k^p} .
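To watch the condensation in action numerically, here is a small Python sketch (my own code and variable names) comparing the partial sums s_N with the condensed sums t_K for a_n = \frac{1}{n(\log n)^p}: for p = 2 both columns level off, while for p = 1 both keep growing.

```python
import math

def a(n, p):
    """Terms a_n = 1 / (n * (log n)^p), defined for n >= 2."""
    return 1.0 / (n * math.log(n) ** p)

def partial_sum(N, p):
    """s_N = a_2 + ... + a_N."""
    return sum(a(n, p) for n in range(2, N + 1))

def condensed_sum(K, p):
    """t_K = sum over 1 <= k <= K of 2^k * a_{2^k}."""
    return sum(2 ** k * a(2 ** k, p) for k in range(1, K + 1))

for p in (2.0, 1.0):
    print(p,
          [round(partial_sum(2 ** K, p), 3) for K in (5, 10, 15)],
          [round(condensed_sum(K, p), 3) for K in (5, 10, 15)])
```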

p-Sylow subgroups and why they exist

Let G be a finite group and p a prime such that the order of G is |G| = p^nm where p and m are coprime. Cauchy's Theorem for finite groups tells us that there exists x \in G whose order is p. We also know by Lagrange's Theorem for finite groups that given a subgroup H \leq G , the order of H divides the order of G.

Putting these two together we can say that the 'most simple' subgroup of our finite group G is the subgroup \langle x \rangle , a copy of the cyclic group \mathbb{Z}_p. As our goal (sorry I didn't tell you this earlier) is to look at the subgroups of G, it seems clear that the next 'most simple' subgroups will have order some power of our prime, i.e. p^k where 1 \leq k \leq n.

By 'simple' I mean in some sense we are looking for subgroups of G whose existence and size are determined only by knowing that our prime in question p divides |G|.

Enough philosophy, let us define some notions. A p-group is a finite group whose order is a power of a prime p. H is a p-subgroup of a group G if it is a subgroup and a p-group (p necessarily has to divide the order of G). What is the largest p-subgroup of G? Well, Lagrange gives us some constraints and motivates the definition of a p-Sylow subgroup. H is a p-Sylow subgroup if it is a subgroup of G of order p^n , where p^n is the highest power of p dividing the order of G.

So I hope I’ve described some motives for looking at/for p-Sylow subgroups but can we even guarantee the existence of such a subgroup of our finite group G? Also, would it count as a surprise if I said the answer was yes?

To prove such a result we will use induction on the order of G. If the order of G is prime the result follows from Cauchy’s Theorem. Now suppose we can always find a p-Sylow subgroup in every finite group whose order is divisible by p and is strictly less than that of G.

Lagrange tells us for a subgroup H \leq G we have |G| = |H|[G:H] , so if H is proper and p doesn't divide [G:H] we can apply the inductive hypothesis to H, and the p-Sylow subgroup we find for H will also be a p-Sylow subgroup of G.

We rule this case out and suppose that for all proper subgroups H of G the index [G:H] is divisible by p . Let G act on G by conjugation, the action for this being g * h := g^{-1}hg where g, h \in G. Applying the orbit stabiliser theorem tells us for each orbit  G * x   we have |G * x| = [G:G_x] where x \in G and G_x is its stabiliser. Orbits partition G, and unravelling the meaning of orbits in this action gives us |G| = |Z(G)| + \sum [G:G_x] , where the sum runs over representatives of the orbits of size greater than one and Z(G) is the abelian subgroup Z(G) = \{g \in G : hgh^{-1} =g \mbox{ for all } h \in G \} of G, often called the centre of G. I hope by now when reading these posts you have pen and paper, as this is a moment when you should verify what I am claiming. By the case we ruled out, each term [G:G_x] in the sum is divisible by p, so p divides |Z(G)| , and applying Cauchy's Theorem we can find an element x \in Z(G) of order p. As \langle x \rangle \leq Z(G) , every element of \langle x \rangle commutes with all of G and so \langle x \rangle \trianglelefteq G .

This gives us a quotient group G / \langle x \rangle and a quotient map \phi : G \longrightarrow G / \langle x \rangle , and applying the inductive hypothesis to G / \langle x \rangle gives us a p-Sylow subgroup K' of G / \langle x \rangle whose order is p^{n-1} , due to the equality |G| = |\langle x \rangle||G/ \langle x \rangle| = p|G/\langle x \rangle| from Lagrange. Looking at K = \phi^{-1}(K') , as \langle x \rangle \subset K and \phi maps K onto K' we have K' \cong K/\langle x \rangle , and applying Lagrange (again) we obtain |K|= |\langle x \rangle||K'| = p \cdot p^{n-1} = p^n , as desired. We have found our p-Sylow subgroup K of G and this finishes the proof.
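Now that the proof is done, here is a small Python sketch (my own code) you can use to check the class equation |G| = |Z(G)| + \sum [G:G_x] used above on a concrete group, taking S_3 represented by permutation tuples and using the same conjugation action g * h = g^{-1}hg.

```python
from itertools import permutations

G = list(permutations(range(3)))            # S_3 as permutation tuples

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def conjugate(g, h):
    """The action g * h = g^{-1} h g used above."""
    return compose(compose(inverse(g), h), g)

centre = [h for h in G if all(conjugate(g, h) == h for g in G)]

# Orbit (conjugacy class) sizes of representatives outside the centre
seen, nontrivial_orbits = set(centre), []
for h in G:
    if h not in seen:
        orbit = {conjugate(g, h) for g in G}
        nontrivial_orbits.append(len(orbit))
        seen |= orbit

print(len(G) == len(centre) + sum(nontrivial_orbits))  # True: 6 = 1 + (3 + 2)
```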

So we have existence, but what more can we ask for and look for? How many such subgroups can we find? Also, if we have an arbitrary p-subgroup, is it always contained in a p-Sylow subgroup? How hard is it to find such subgroups?

If you don’t know the answer to these it is instructive to try to have a think about how you would go about answering these and try finding some p-Sylow subgroups for your favourite groups. As a hint conjugation is very important.

Unusual metrics on the rationals

Suppose one day you get bored of the triangle inequality. You have done so much real analysis that you cannot see anything new and interesting in |a-b| \le |a-z| + |z-b| anymore. But you want to keep doing analysis.

What can we do about this? One solution may be to make this inequality more strict and see where it takes us. For example, you may consider the following: |a-b| \le \max{\{|a-z|,|z-b|\}}. Note that this inequality does not hold under our usual absolute value, for example: 5 = |8 - 3| \not \le \max{ \{ |8-4|, |4-3| \} } = \max{ \{4,1\} } = 4. But does this tweaking of the triangle inequality add anything new? It may not look like it at first, but it is a dream come true: a sequence is Cauchy if and only if the distance between consecutive terms goes to zero! More mathematically: the sequence (a_n)_{n \in \mathbb{N}} is Cauchy \iff |a_n-a_{n+1}| \to 0 as n \to \infty

This is not true in the case of our usual absolute value. For example, the infamous sequence H_n = \sum_{k = 1}^{n} \frac{1}{k} is not Cauchy in the reals \mathbb{R}, for if it were Cauchy, it would converge since the real numbers are complete. However, the difference between consecutive terms is \frac{1}{n+1}, which clearly tends to zero.
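A two-line numerical illustration of this (my own code): the partial sums H_n keep growing while the consecutive differences shrink.

```python
H = 0.0
for n in range(1, 100001):
    H += 1.0 / n                 # H_n = 1 + 1/2 + ... + 1/n
print(H)                         # ~12.09, still growing (roughly like log n)
print(1.0 / 100001)              # |H_{n+1} - H_n| for n = 100000: already tiny
```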

Before we dive into the proof of the previous claim, we need to note something: we are talking about convergence in \mathbb{Q}, however this would work in any metric space (X,d) where the metric satisfies d(a,b) \le  \max{\{d(a,z), d(z,b) \}} for any a,b,z \in X. This is because an absolute value on \mathbb{Q} induces a metric by setting d(a,b) = |a-b|.

Theorem: Let | \cdot | be an absolute value on \mathbb{Q} satisfying |a+b| \le \max{ \{ |a|,|b|\} }. Then any sequence (a_n)_{n \in \mathbb{N}} of rational numbers is Cauchy if and only if |a_n-a_{n+1}| \to 0 as n \to \infty.

Proof: ( \implies) Let \epsilon > 0. Since (a_n)_{n \in \mathbb{N}} is Cauchy there exists N \in \mathbb{N} such that for all n,m > N we have |a_n - a_m| < \epsilon. In particular, whenever m = n+1, we obtain  |a_n - a_{n+1}| < \epsilon, which means that |a_n - a_{n+1}| \to 0 as n \to \infty.

( \impliedby ) Let \epsilon > 0. There exists N \in \mathbb{N} such that |a_n - a_{n+1}| < \epsilon for any n > N.
Then, for all m \ge n > N, applying our strict triangle inequality:

\begin{aligned} |a_n - a_m| &= |a_n - a_{n+1} + a_{n+1} - \dots + a_{m-1} -a_{m}| \\  &\le \max{\{|a_n - a_{n+1}|, \dots,  |a_{m-1} -a_{m}| \} } < \epsilon \end{aligned} .

Therefore the sequence is Cauchy.   \square

This is all well and good, however, it makes us wonder whether there is any absolute value on \mathbb{Q} with this strict triangle inequality property. The answer is yes, in fact, there are infinitely many of them, but this is the topic for another post. Until then, to satisfy your curiosity I will write down one such absolute value and I will leave it up to you to check that it is indeed an absolute value:

|x|_3 = \frac{1}{3^{v(x)}}
where v(x) = n is the exponent of 3 whenever we write the non-zero rational number as x = 3^n \frac{p}{q}, where p,q are integers coprime to 3 (with the convention that |0|_3 = 0).
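Here is a small Python sketch of this absolute value (my own code, using the standard fractions module and the convention |0|_3 = 0), together with a random spot-check of the strong triangle inequality.

```python
from fractions import Fraction
import random

def v3(x):
    """Exponent of 3 in the non-zero rational x = 3^n * p/q with p, q coprime to 3."""
    n, num, den = 0, x.numerator, x.denominator
    while num % 3 == 0:
        num //= 3; n += 1
    while den % 3 == 0:
        den //= 3; n -= 1
    return n

def abs3(x):
    """The 3-adic absolute value |x|_3 = 3^(-v(x)), with |0|_3 = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(1, 3) ** v3(x)

print(abs3(9), abs3(Fraction(5, 27)), abs3(6))   # 1/9, 27, 1/3

# Spot-check |a + b|_3 <= max(|a|_3, |b|_3) on random rationals:
for _ in range(1000):
    a = Fraction(random.randint(-50, 50), random.randint(1, 50))
    b = Fraction(random.randint(-50, 50), random.randint(1, 50))
    assert abs3(a + b) <= max(abs3(a), abs3(b))
print("strong triangle inequality holds on all samples")
```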

Can you suggest a different absolute value?

When is the zero polynomial the only zero function?

Like all mathematics, the first thing to do is set the landscape in which you wish to work. Let K be a field and let K[x_1,...,x_n] be the commutative polynomial ring in n variables (if you are familiar with this ring I would skip the next paragraph).

What do the elements of  K[x_1,...,x_n] look like? Taking n = 3 it is not hard to come up with elements, for example x_1x_3 +{x_2}^4 + {x_2}^3{x_3}^2 . One can quite easily start conjuring up polynomials for a given n, but a more interesting approach is to ask: can we find a 'generating' set for our polynomials? This brings one to the notion of a monomial, which is simply a product of the form x_1^{\alpha_1} \cdots x_n^{\alpha_n} . We can simplify the notation by starting with an n-tuple \alpha = (\alpha_1,..., \alpha_n) of non-negative integers and letting X^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n} . Clearly a polynomial f \in K[x_1,...,x_n] can be written as a finite linear combination of monomials with the general form f = \sum_{\alpha \in A} C_\alpha X^\alpha where C_\alpha \in K and A \subset {\mathbb{N}_{0}}^n with A finite. Without knowing what a polynomial is a priori, this is the method used to define a polynomial in n variables.

K[x_1,...,x_n] is a ring with addition and multiplication defined how you would expect, and as K is a field the ring is commutative. Given f \in K[x_1,...,x_n] we can think of it as an algebraic element of our ring or as a function f : K^n \longrightarrow K by swapping x_i with some a_i \in K for all i . This gives us two notions of zero: the zero element of K[x_1,...,x_n] and a zero function from K^n \longrightarrow K .

To emphasise the difference let us take K = \mathbb{F}_2 , n = 1 and look at the polynomials f = 0 and g = x(x-1) in K[x]. As algebraic elements clearly f is the zero element and g isn't, but as functions g(0) = 0 and g(1) = 0, so we can conclude both f and g are zero functions. To wrap up, we have two distinct elements that are both zero functions K^n \longrightarrow K.

This raises a big question, and one which we will answer: when is the zero polynomial the only zero function? (I didn't lead you on under false pretenses with the title.) The answer depends on the cardinality of K .

If K is finite of size q, then by Lagrange's Theorem applied to the multiplicative group K\backslash \{0\} one can show x^q = x for all x \in K , so the polynomial f = x^q- x is a zero function but is distinct from the zero element of K[x] . If n > 1 we can take f = x_1^q - x_1 . This tells us that when K is finite we can always come up with elements of K[x_1,...,x_n] that are non-zero yet are zero functions K^n \longrightarrow K , so the zero polynomial is not the only zero function.
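A quick brute-force check in Python (my own code) for the prime fields \mathbb{F}_p: the polynomial x^p - x has non-zero coefficients yet evaluates to 0 at every point of \mathbb{F}_p.

```python
def is_zero_function(coeffs, p):
    """Check whether the polynomial with the given coefficients (constant term first)
    evaluates to 0 at every element of the prime field F_p = Z/pZ."""
    return all(sum(c * pow(a, i, p) for i, c in enumerate(coeffs)) % p == 0
               for a in range(p))

for p in (2, 3, 5, 7):
    # x^p - x: coefficient -1 for x, coefficient 1 for x^p, everything else 0
    coeffs = [0, -1] + [0] * (p - 2) + [1]
    print(p, is_zero_function(coeffs, p))   # True for every prime p
```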

What about the case when K is infinite? Well…

Theorem: Let K be an infinite field, let f \in K[x_1,...,x_n] . Then, f = 0 if and only if f : K^n \longrightarrow K is the zero function.

Proof of Theorem: Let us begin with the 'easy' direction. Suppose f = 0 \in K[x_1,...,x_n] ; trivially we must have f(\alpha) = 0 for all \alpha \in K^n . The other direction requires us to show that if f \in K[x_1,...,x_n] is such that f(\alpha) = 0 for all \alpha \in K^n , then f is the zero element of K[x_1,...,x_n] .

We will use induction on the number of variables n . When n = 1, a non-zero polynomial in K[x] of degree m has at most m roots. Suppose f \in K[x] is such that f(\alpha) = 0 for all \alpha \in K . Since K is infinite, f has infinitely many roots, and as the degree of f is finite we must conclude that f = 0 .

Now, assuming the inductive hypothesis for n - 1, let f \in K[x_1,...,x_n] be a polynomial such that f(\alpha) = 0 for all \alpha \in K^n . By collecting the various powers of x_n we can write f in the form f = \sum_{i=0}^{N} g_i(x_1,...,x_{n-1})x_n^i , where N is the highest power of x_n that appears in f and g_i \in K[x_1,...,x_{n-1} ] . If we show that each g_i is the zero polynomial in n - 1 variables, this will force f = 0 in K[x_1,...,x_n] .

If we fix (\alpha_1,...,\alpha_{n-1}) \in K^{n-1} we get a polynomial f(\alpha_1,...,\alpha_{n-1}, x_n) \in K[x_n] , and by our hypothesis this polynomial vanishes for all \alpha_n \in K . Using our above representation of f we see the coefficients of f(\alpha_1,...,\alpha_{n-1}, x_n) are g_i(\alpha_1,...,\alpha_{n-1}) , and by the n = 1 case these coefficients must all vanish, i.e. g_i(\alpha_1,...,\alpha_{n-1}) = 0 for all i . As (\alpha_1,...,\alpha_{n-1}) \in K^{n-1} is chosen arbitrarily, g_i is the zero function K^{n-1} \longrightarrow K and so by the inductive hypothesis g_i is the zero polynomial in K[x_1,..., x_{n-1}] .

Bringing both of these results together lets us conclude that the zero polynomial is the only zero function K^n \longrightarrow K if and only if the field K is infinite. I find it quite remarkable that knowledge about zero functions gives you an indication of the cardinality of K .

The theorem we have just proved also lets us conclude that when K is infinite, polynomials are determined by their values as functions; more precisely, for f, g \in K[x_1,...,x_n] we have f = g in K[x_1,...,x_n] if and only if f: K^n \longrightarrow K and g : K^n \longrightarrow K are the same function. I leave the proof of this for the reader, and as a hint consider the polynomial f - g.

 

An upper bound for roots of polynomials

A natural question to ask when given a polynomial is: how do the roots and the coefficients interplay with one another? In fact a huge portion of polynomial theory and solving algebraic equations is finding out the details of this relationship. The quadratic formula is a prime example – it gives a way of finding roots by only using the values of the coefficients. An explicit relationship, an algebraic formula in radicals, is in general only possible for polynomials of degree less than 5, a famous result pioneered by the work of Galois.

The interplay we are going to determine for polynomials of any degree is that you only need to look for roots in a neighborhood around 0 with the neighborhood’s size depending only on the size of the polynomial’s coefficients.

Let p \in \mathbb{R}[x] be a polynomial over \mathbb{R} of degree n with the representation p(x) = a_nx^n + \cdots + a_1x + a_0 , where a_n \neq 0 and |a_i| \leq A for all i = 0,\ldots, n-1 and some A \in \mathbb{R} . We will be considering our polynomial as a function p : \mathbb{C} \longrightarrow \mathbb{C} , and we set M \geq 0 and R = 1 + \frac{A + M}{|a_n|} , where our choice of R will become more apparent later.

What we are going to show is that for x of a large enough magnitude our polynomial will have a magnitude as large as we like. We will also establish a simple bound on the magnitude of the roots of our polynomial p .

Suppose |x| \geq R . We have |a_n x^n| = \frac{A + M}{R- 1}|x|^n by the definition of our constants, and estimating |a_{n-1}x^{n-1} + \cdots + a_0| gives | a_{n-1}x^{n-1} + \cdots + a_0 | \leq A (|x|^{n-1} + \cdots + |x| + 1 ) = A\frac{|x|^n - 1}{|x| - 1 } < A \frac{|x|^n}{R-1} , using the triangle inequality, the sum of a geometric series, and |x| \geq R > 1 .

Using the (reverse) triangle inequality allows us to conclude |a_nx^n| - |a_{n-1}x^{n-1} + \cdots + a_0 | \leq |p(x)| , which gives us, by using the previous estimation, |p(x)| > \frac{A+ M}{R-1} |x|^n - \frac{A}{R-1} |x|^n = \frac{M|x|^n}{R-1} \geq \frac{MR^n}{R-1} \geq M , as |x| \geq R and R^n \geq R > R - 1 since R > 1. In particular |p(x)| > M .

Bringing this together we have shown for all M \geq 0 if we look at |x| \geq R where R = 1 + \frac{A + M}{|a_n|} we have |p(x)| > M .

Looking at the contrapositive of this statement lets us conclude that if |p(x)| \leq M for some x \in \mathbb{C} and M \geq 0 , we must have |x| < R , or equivalently x lives in the ball B(0;R). If we take x to be a root of p , then by setting M = 0 we have |x| < R = 1 +\frac{A}{|a_n|} , giving us a bound on the size of our root in terms of the polynomial's coefficients. This means all roots of p will be found in the ball B(0; 1 + \frac{A}{|a_n|} ).
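As a final sanity check, here is a small Python sketch (my own code; numpy.roots expects coefficients with the highest power first) confirming that the computed roots of a sample polynomial all lie inside the ball B(0; 1 + \frac{A}{|a_n|}).

```python
import numpy as np

def root_bound(coeffs):
    """Bound 1 + A/|a_n| for a polynomial with coefficients a_n, ..., a_0
    (highest power first), where A bounds |a_i| for i < n."""
    a_n, lower = coeffs[0], coeffs[1:]
    A = max(abs(c) for c in lower)
    return 1 + A / abs(a_n)

coeffs = [2.0, -3.0, 0.0, 5.0, -1.0]     # 2x^4 - 3x^3 + 5x - 1
R = root_bound(coeffs)
roots = np.roots(coeffs)
print(R, max(abs(r) for r in roots), all(abs(r) < R for r in roots))  # last entry: True
```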