\chapter{Methodology}
Knowledge about how students conceptualize is qualitative in nature. Methodology
for qualitative research varies, but has standard parts: the design of the
study, the sources and their selection, the data, the process of analysis, the interpretation,
and the approach to validation. Sample selection is recorded and reported
so that others may judge transferability to their own context. Interviews are
the principal technique used in phenomenographic research. Documents
can also be used. The normal conduct of teaching can also provide usable data,
if in an anonymous, aggregate form. Both deductive and inductive
analysis can be applied to these qualitative data.
%\chapter{Design of the Study}
This work is a qualitative study. The underlying philosophy is constructivist,
the research perspective is phenomenography, as extended to variation theory,
and the epistemological framework is a layered collection of intellectual disciplines.
At the highest level of integration, computer science and mathematics
reside, supported by studies in memory and attention, including computational
complexity applied to cognitive neuroscience, and neurophysiology.
This study is qualitative because
we seek to describe the nature of the various
understandings achieved by the students, rather than the relative frequency
with which any particular understanding is obtained.
% the focus is on determining what questions
%would be posed, in the process of continuous curriculum adaptation and improvement
%the meaning students are making of their specific educational experiences.
\section{Population Studied}
In a phenomenographic study, it is desirable to sample widely to obtain as broad a view as possible of the multiple ways of experiencing a phenomenon within the population of interest \cite{marton1997learning}, which in turn cites Glaser and Strauss 1967 \cite{glaser1968discovery}. We studied, by interview, homework and test, undergraduate students who have taken courses involving proof. Typically, but not always, these are students majoring in computer science. Some of these undergraduate students are dual majors in computer science and mathematics. We interviewed graduate students, emphasizing those who have been teaching assistants for courses involving proofs. We interviewed faculty who have taught courses involving proof. We interviewed former students who have graduated from the department. We interviewed undergraduates who transferred out of the department.
\section{Chronology of the Design}
The design of this study began while teaching Introduction to the Theory of
Computing. While helping students learn the pumping lemma for regular
languages, and trying to understand from where the several difficulties arose,
I became curious about the bases of these difficulties. One example was that
a student felt strongly that a variable, a letter, denoting repetitions in a mathematical
formulation, could only stand for a single numeric value, rather than a
domain. Subsequently I learned that symbolization is a category identified
by Harel and Sowder \cite{harel1998students} for students of mathematics learning proofs. Our
student illustrates that our computer science student population harbors
some of the same conceptualizations. As a consequence of this opinion, the
student felt that showing that a mathematical formulation had a true value
was equivalent to demonstrating a true value for a single example, rather than
demonstrating a true value for a domain. Here we see evidence for the category
Harel and Sowder \cite{harel1998students} call (is it inductive, perceptual?) where an example
is thought to provide proof of a universal statement. Later, while helping students
study the relationship between context free grammars and pushdown
automata, I learned from the students that many of them did not find inductive
proofs convincing. Subsequently I have learned that Harel and Sowder \cite{harel1998students} created
a category called axiomatic reasoning. In axiomatic reasoning, students
begin with accepted information, such as axioms and premises, and apply rules
of inference to deduce the desired goal. Their students had not always
reached this category, and neither had ours. As later interview data showed,
some of our students learn to produce the artifact of a proof by mathematical
induction by procedure. They learn the parts, and they supply the parts
when asked, but are not themselves convinced. (McGowan and Tall report a similar situation.) This matches two other
categories created by Harel and Sowder \cite{harel1998students}, internalization and interiorization.
Still later, when leading a course on ethical reasoning for issues related to computer
science, I found that most of the students did not notice that methods
of valid deductive argumentation were tools that they might apply to defend
their opinions.
Thus the idea of exploring the nature of the students' degrees of preparation
for understanding and creating proofs appeared.
First, interviews about proofs in general were conducted, with a broad interview
script.
%The students almost all selected proofs by mathematical induction.
During analysis of these data, a more elaborate interview script was developed,
aiming at the ideas of domain, range, relation, mapping, function, the ideas of
variable, as in programs and mathematical formulations, and abstraction.
Some students emphasized that mathematical definitions are analogous to
definitions in natural languages, and that mathematical discourse is carried
out in the mathematical language created by these definitions.
The capabilities for expression and care bestowed by these definitions invest
mathematical reasoning with its persuasive power.
Thus the reasoning processes and the clearly defined
mathematical concepts they use together give mathematical argumentation
its ability to convince. Students who appreciated this found it invigorating.
Other students had different reactions to definitions. Thus, the role of definitions
and language became another area of exploration.
The difference between a domain and a single point in a domain can be seen as
a difference in level of abstraction. If something is true for a single point in a domain, and
the same reasoning shows it to be true for every point in the domain, then the point can be seen as a
generic particular point, representative of the domain. This ability
to represent is related to the idea of abstraction.
We saw data in this study that affirmed the observations of others, that students
do not always easily recognize the possibility of abstraction.
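Stated as a sketch in standard logical notation (the symbols $D$ for the domain, $P$ for the property, and $c$ for the chosen point are introduced here for illustration and do not come from the interview data), a generic particular is what licenses universal generalization: if $P(c)$ is derived for an arbitrary $c \in D$, using no fact about $c$ other than $c \in D$, then
\[
\forall x \in D,\; P(x).
\]
By contrast, checking $P(c_0)$ for one specific, conveniently chosen $c_0 \in D$ establishes only
\[
\exists x \in D,\; P(x),
\]
which is the situation of a student who treats a single example as a proof of a universal statement.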
\section{Parts of the Study}
The study was devoted to proofs, a subject that can be subdivided.
Part of the study was aimed at the idea of domain, directed at the concept that
though a variable could identify a scalar, it might also represent a set.
Part of the study was aimed at the activity of abstraction, because some students
exhibited the ability to operate at one level of abstraction, not necessarily a
concrete level, yet the ability to traverse between that level of abstraction and
a concrete level seemed to be absent. Other students claimed to be able to
understand concrete examples with ease, but to encounter difficulty when
short variable names were used within the same logical argument.
%\subsection{Order of Exploration}
%The order of exploration was data driven, thus the material was sought sometimes
%in reverse order of the curriculum, almost as if seeking bedrock by starting
%at a surface, and working downwards.
\section{Design of the Study}
We
conducted over 30 interviews.
Our interview participants included undergraduate and graduate students of (\textbf{how do we want to say this}). Most of the graduate students interviewed were teaching assistants in courses that taught or used proofs. We also included faculty who taught courses that involved proofs.
Information learned in tutoring and lecturing inspired the research questions.
We used exams to study errors in application of the pumping lemma for regular
languages. We used early interviews to explore proof, adapting to the
student preference for proof by mathematical induction and incorporating the use of recursive
algorithms. We used homework
to observe student attempts at proofs.
We used later interviews to investigate the remaining questions mentioned earlier.
\textbf{Need to make this true: We used homework to observe student familiarity/facility
with different (specific) proof techniques: induction, construction, contradiction,
and what students think it takes to make an argument valid.}
\section{Sample Selection}
\textbf{Need even more detail than what's here. It could alternatively be put in subsections.}
Students from the University of Connecticut who have taken or are taking the
relevant courses were offered the opportunity to be interviewed. The students
who volunteered were mostly male, mostly traditionally aged undergraduates,
though some graduate students also volunteered. Some students were
domestic, and some international. Some students were African-American,
some Asian, some Caucasian, some Latino/a, some with learning disabilities
such as being diagnosed as on the autistic spectrum.
%\subsection{Proofs Using the Pumping Lemma for Regular Languages}
The participants for the study of proofs using the pumping lemma for regular languages were
%In a recent course offering to
forty-two students, of whom thirty-four were men and eight were women.
Forty-one were of traditional age.
%one former Marine somewhat older, one collegiate athlete (a
%woman),
Three students had Latin-heritage surnames, about one quarter of the
students had Asian heritage, two had African heritage, and eight were international
students. Each student individually took the final exam. A choice among
five questions was part of the final exam; one required applying the pumping
lemma. Half the students (21 of 42) selected this problem: 17 men
and 4 women. Of those selecting the pumping lemma problem, 15 of 21 (about 71\%)
got it wrong. These students, who chose the pumping lemma problem and
subsequently erred on it, form the population of our study.
%\subsection{Proofs by Mathematic Induction}
The participants for the study of proof by mathematical induction
%We studied students who
were taking, or
% who
had recently taken, a course
on Discrete Systems required of all computer science, and computer science and
engineering students.
Volunteers were solicited from all students attending the Discrete Systems
courses.
Interviews of eleven students were transcribed for this study. Participants
included 2 women and 9 men. Two were international students; a third was a
recent immigrant.
%\subsection{Domain, Range, Mapping, Relation, Function, Equivalence in Proofs}
For the study about domain, range, mapping, relation, function and equivalence in proofs, students
%Students
taking, or having taken, Discrete Systems, especially students who
had sought help while taking introductory object-oriented programming, volunteered
to be interviewed.
\section{Data Collection}
Our corpus included interview transcripts, homework, practice and real tests,
observations from individual tutoring sessions, and group help sessions. Interview
transcripts were analyzed with thematic analysis. Homework, practice
and real tests were analyzed for proof attempts. Data from individual tutoring
sessions and group help sessions were also informative. Aggregate, anonymous
data were used.
\subsection{Interviews}
\subsection{Documents}
\subsubsection{Proofs Using the Pumping Lemma for Regular Languages}
The study was carried out on the exam documents. The interpretation was informed
by remembering events that occurred in the natural conduct of lectures,
help sessions and tutoring.
One method of assessing whether students understood the ease of applying
the pumping lemma to a language to be proved not regular was to offer a
choice between using the Myhill-Nerode theorem, with a strong hint, and using
the pumping lemma. The problem could very easily
have been solved by applying the Myhill-Nerode theorem, especially with
the supplied hint. When tackled with the pumping lemma, it was designed to
require, for each possible segmentation, a different value of $i$ (the number of
repetitions) that would create a string outside of the language. The intent was
to separate students who understood the meaning of the equation's symbols,
and of the equation itself, from those students engaged in a manipulation with at
most superficial understanding.
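For reference, the standard textbook statement of the pumping lemma for regular languages can be sketched as follows. This is not a reproduction of the wording given to students on the exam, and $s$ is used here for the string being segmented. For every regular language $\mathcal{L}$ there exists a pumping length $p \geq 1$ such that every $s \in \mathcal{L}$ with $|s| \geq p$ can be written as $s = xyz$ with
\[
|y| \geq 1, \qquad |xy| \leq p, \qquad \forall i \geq 0:\; xy^iz \in \mathcal{L}.
\]
To show that a language is not regular, one argues the negation: for every $p$, exhibit some $s \in \mathcal{L}$ with $|s| \geq p$ such that, for every segmentation $s = xyz$ satisfying the first two conditions, there exists some $i \geq 0$ with $xy^iz \notin \mathcal{L}$. Because $i$ is chosen after the segmentation, it may differ from one segmentation to another.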
\subsubsection{Proofs by Mathematical Induction}
Interviews were solicited in class by general announcement, and by email.
Interviews were conducted in person, using a voice recorder. No further
interview script, beyond these following few questions, was used. The interviews
began with a general invitation to discuss students' experience with and
thoughts on proofs from any time, such as high school, generally starting with
\begin{itemize}
\item ``Tell me anything that comes to your mind on the subject of using proofs,
creating proofs, things like that.''
\end{itemize}
and then following up with appropriate questions to get the students to elaborate
on their answers.
Additional questions from the script that were used when appropriate included
\begin{itemize}
\item ``Why do you think proofs are included in the computer science curriculum?'',
\item ``Do you like creating proofs?''
\end{itemize}
and, after proof by induction was discussed,
\begin{itemize}
\item ``Do you see any relation between proof by induction and recursive algorithms?''
\end{itemize}
Almost every student introduced and described proof by mathematical induction as experienced
in their current or recent class.
\section{Expanded semi-structured interview protocol for domain, range, language, equivalence class in Proofs}
\section{Expanded semi-structured interview protocol for definitions, language, reasoning in Proofs}
\section{Analysis}
Describing how analysis was done in detail is really important.
How do you do phenomenography?
Is this the way everything was analyzed?
Marton and Booth\cite[p. 133]{marton1997learning} describe a desirable analysis technique:
Apply the principle of focusing on one aspect of the object and seeking its dimension of variation while holding other aspects frozen. %partial derivative
Remember to apply both perspectives, that pertaining to the individual and that pertaining to the collective.
Establish a perspective with boundaries, within which to look for variation.
\begin{enumerate}
\item search for extracts from the data that might pertain to the perspective
\item inspect them in the context of their own interview
\item inspect them in the context of other extracts, from all interviews, on the same theme
\end{enumerate}
\begin{enumerate}
\item select one aspect of the phenomenon and inspect across all subjects
\item select another aspect
\item whole interview -- to see where these two aspects lie relative to other aspects, and to background
\end{enumerate}
\begin{enumerate}
\item all of the research problems, one problem at a time; whole transcripts that have particularly interesting ways of handling the problem
\end{enumerate}
Keep going; clarity will emerge.
Result: identify a number of qualitatively different ways in which the phenomenon has been experienced (not forgetting different methods of expression) \cite[p. 133]{marton1997learning}.
Overlap of the material at the collective level is expected.
Assume that what people say is logical from their point of view \cite[p. 134]{marton1997learning}, citing Smedslund \cite{smedslund1970circular}.
Data were analyzed using a modified version of thematic analysis, which is
in turn a form of basic inductive analysis \cite{Merriam2002,Merriam2009,braun2006using,fereday2008demonstrating,boyatzis1998transforming}. Using thematic analysis, we
read texts, including transcripts, looked for ``units of meaning'', and extracted
these phrases. Deductive categorization began with defined categories, and
sorted data into them. Inductive categorization ``learned'' the categories, in
the sense of machine learning, which is to say, the categories were determined
from the data, as features and relationships found among the data suggested
more and less closely related elements of the data. A check on the development
of categories compared the categories with the collection of units of meaning.
Each category was named by either an actual unit of meaning (obtained during
open coding) or a synonym (developed to capture the essence of the category).
A memo was written to capture the summary meaning of the category.

Next, a process called axial coding, found in the literature on grounded theory
\cite{strauss1990basics,kendall1999axial,glaser2008conceptualization}, was applied. This process considered each category in turn as a central
hub; attention focused on pairwise relations between that central category
and each of the others. The strength and character of the posited relationship
between each pair of categories were assessed. On the basis of the relationships
characterized in this exercise, the categories with the strongest interesting relationships
were promoted to main themes. A diagram showing the main
themes and their relationships, qualified by the other, subsidiary themes and
the relationships between the subsidiary and main themes, was prepared to
present the findings. Using the process of constant comparison, the structure
of these relationships was reviewed in the light of the meanings of the categories.
A memo was written about each relationship in the diagram, referring
to the meaning of the categories and declaring the meaning of the relationship.
A narrative was written to capture the content of the diagram. Using the
process of constant comparison, the narrative was reviewed to see whether it
captured the sense of the diagram. Units of meaning were compared with the
narrative and their original context, to see whether the narrative seemed to
capture the meaning. The products of the analysis were the narrative and the
diagram.
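As a rough indication of the scale of the axial coding step, offered as an illustrative calculation and not as a count taken from the study: if $k$ categories emerge from open coding (the symbol $k$ and the value used below are assumptions), then treating each category in turn as the hub and relating it to each of the others examines
\[
k(k-1) \mbox{ hub-to-other relations, that is, } \frac{k(k-1)}{2} \mbox{ distinct pairs of categories.}
\]
For a hypothetical $k = 8$ this gives $8 \cdot 7 / 2 = 28$ pairs, each of which is assessed for strength and character.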
%\chapter{Analysis}
The product of analysis in a phenomenographic study is a set of categories and relationships \textit{among} them.
Marton and Booth \cite[p. 135]{marton1997learning} state ``in the late stages of analysis, our researcher [has] a sharply structured object of research, with clearly related faces, rich in meaning. She is able to bring into focus now one aspect, now another; she is able to see how they fit together like pieces of a multidimensional jigsaw puzzle; she is able to turn it around and see it against the background of the different situations that it now transcends.''
Using Marton's overriding categorization of task and objective, we can consider that some students do not know, at least when they are studying CSE2500, that they need to be able to understand some proofs to be good developers. Therefore, they can logically approach the study of proof in CSE2500 as a task, having some facts that they must memorize. Other students, including those with dual majors in math, wish to improve themselves by improving their ability to couch arguments in mathematical terms and both ascertain facts for themselves and convince others. Students who are aware that there are computer-science-related purposes for proof (for example, that in the study of algorithms they will use proofs to understand resource consumption) will recognize the study as having the objective of improving their own ability to deal with proof.
Because the relationships are expected to form a partial order, corresponding to set inclusion of subsets of the complete (with respect to the objective of teaching) understanding of the information being taught, we can say relationships \textit{between} categories.
The set inclusion relationship can be that a deeper understanding includes a more superficial understanding.
It may also be that a deeper understanding qualifies a more superficial understanding, such as being applicable in a restricted domain. Thus, understanding of a liquid as being divisible to any degree can be qualified as to scale such as macroscopic, microscopic and so on.
In a phenomenographic study, this partial order is referred to as a hierarchical order.
The objective of teaching, which in some parts of this study will be the components of proof, may have many parts, called, in phenomenology, internal structure. The granularity with which the objective of teaching is subdivided determines the number of parts in the internal structure. If we let $n$ denote the number of parts obtained with a specific granularity, we see that the number of subsets will be $2^n$, which will be inconveniently large unless the granularity is sufficiently coarse. Thus, we choose a granularity resulting in approximately 4 elements of internal structure.
Thus, when the teaching objective is ``what is a proof,'' we may limit the granularity such that the internal structure of a proof is, for example,
\begin{enumerate}
\item that particular statement which is to be proved
\item axioms, premises, suppositions, cases
\item other statements
\item warrants (rules of inference)
\end{enumerate}
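As an illustrative calculation only, take the four components listed above as the internal structure and write $S$ for the set of components (the symbol $S$ and the short component labels are introduced here, not taken from the data). The candidate understandings are then the subsets of $S$:
\[
S = \{\mbox{goal}, \mbox{premises}, \mbox{other statements}, \mbox{warrants}\}, \qquad |\mathcal{P}(S)| = 2^{|S|} = 2^4 = 16 .
\]
Ordered by set inclusion, these subsets form a partial order and not a chain: $\{\mbox{goal}\} \subseteq \{\mbox{goal}, \mbox{warrants}\}$, whereas $\{\mbox{goal}\}$ and $\{\mbox{warrants}\}$ are incomparable. At a finer granularity of, say, $n = 10$ parts, the count would grow to $2^{10} = 1024$.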
We might choose to pursue finer granularity in some cases. For example, we might pursue ``What is a statement?'' because instructors have found that not all students arriving in CSE3502 have the same depth of understanding of a statement, and some do not understand ``statement'' deeply enough to comprehend a proof.
Marton and Booth\cite[p. 22]{marton1997learning} call our attention to learners directing their attention to the sign vs. to the signified. With proofs, Polya \cite{} has mentioned a procedural approach to executing a proof, without understanding, as have Harel and Sowder \cite{harel1998students}, and Tall\cite{tall2001symbols}. Weber and Alcock\cite{weber2004semantic} have observed and described students omitting understanding of warrants in proofs. In each of these cases, the sign is provided, but the signified is at best incompletely understood.
Analysis can usefully illuminate learning processes, taking note of the temporal domain\cite{marton1997learning}. This has been used by Booth in her analysis of how students understand the process of programming\cite{marton1997learning}.
It is important to look at whether conceptualizations appear in a certain case, or in a certain period of time \cite[p. 136]{marton1997learning}. For example, when students see proofs again in CSE3500 and CSE3502, are they recognized as proofs (some yes, some no), and are they helpful as proofs or troublesome (some helpful, some not, ``never did get that'')?
\section{Analysis of Interviews}
\section{Interview}
Some students remembered taking proofs in high school in geometry.
Some students were taking proofs contemporaneously in philosophy.
Some of the students studying proof in philosophy found them disturbing, expressing a preference for geometrical proofs.
Some students remembered having to furnish proofs of geometrical facts, also facts about prime numbers and sets.
Some students knew that CSE2500 treated proofs because they would be used in later courses. Students did not know why proofs would be used later, and were generally happy to hear some example uses.
Though students were asked whether they made use of proofs spontaneously, none of those interviewed gave an example.
Some students preferred to articulate with code, and some (who were dual majors in computer science and mathematics) sometimes preferred mathematical symbols, depending upon the context.
Some students do wish to convince themselves of things, such as tractable execution times and correctness. Though students were asked whether they made use of proofs for this purpose, none of those interviewed claimed to do so; rather, they mentioned going carefully over their algorithm construction and considering cases.
In interviews, the students almost all chose to discuss proofs by mathematical induction.
\subsection{Themes / Categories}
\begin{itemize}
\item Definitions\\
Students divided into (1) those who found definitions boring, difficult to pay attention to, and undesirable compared to examples, from which they preferred to induce their own definitions, and (2) those who had caught on to the idea that definitions were the carefully crafted building blocks of reasoning.
\item Procedures\\
Students sometimes learned what was desired in a proof, but learned to produce it by procedure, and were not themselves convinced.
\item Context\\
Students asked whether the topics for examples and exercises, such as prime numbers, had relevance to programming, with which they had experience but which seemed to them unrelated to those topics.
Students did not know the context in which the proofs, or the procedural version of proof, were applicable; for example, they did not apply proof by mathematical induction to recursive algorithms, and did not know how to tell whether recursive algorithms would be applicable.
\item Concrete vs. Abstract\\
Some students felt quite comfortable with the application of rules of inference to concrete items, but had difficulty transferring application of those rules to mathematical symbols.
\item Symbolization\\
Consistent with Harel and Sowder's 1998 categorization of concepts, we found students who would attempt to write in symbols but did not understand what was denoted, and consequently were uncertain about appropriate operations. Some of these students were glad to see a progression from pseudocode with long variable names to pseudocode with short variable names to mathematical symbolization (formula translation (FORTRAN) in reverse).
\item Applicability of single examples\\
Some students believed that a few examples constituted a proof. These examples were not generic particulars, nor were they transformational, in the sense of Harel and Sowder's 1998 model.
\item Substructure\\
Students familiar with methods, in the sense of object-oriented programming, and with construction of programs involving multiple method calls, did not always recognize that proofs could be built from multiple lemmas, although they did understand that axioms could be applied.
\item Proofs are used, in computer science, to show resource consumption (complexity class), properties of models of computation, and computability/decidability. No occasion was identified, other than assignment, when students recognized they were undertaking proofs.
\end{itemize}
\subsection{Relationships}
\section{Analysis of Homework and Tests}
\subsection{Proofs}
Proofs submitted on homework and tests were analyzed in several respects.
The overall approach should be valid. For example, students who undertook to prove the converse did not use a valid approach.
The individual statements should each be warranted.
Use of structure, such as lemmas, and care that cases form a partition of the relevant set were noted favorably.
Proof attempts that lost track of the goal, and proof attempts that asserted the goal with insufficient justification, were also noted.
\subsection{Pumping Lemmas}
We wrote descriptions for each error. Some example descriptions
are in Table II.
\begin{center}
\begin{tabular}{l}
\hline
Table II: Some example errors\\
\hline
Let $x$ be empty\\
$|xy| \leq p$, so $xy = 0^p$\\
$|xy| \leq p$; let $x = 0^{p+r}$, $y = 0^{p+r}$, $0 < r < p$\\
Let's choose $|xy| = p$\\
$0^{p+1}0^b1^p \neq 0^{p+1}1^p \;\therefore\; xy^2z \not\in \mathcal{L}$, where $\mathcal{L} = \{0^i1^j, i \neq j\}$\\
we choose $s = 0^{p+1}1^p$ within $|xy|$\\
thus $\neq 0^p1^{p+1}$\\
Let $x = 0^a$, $y = 0^b1^a$\\
$x = 0^{p-h}$, $y = 0^h$\\
$x = 0^i$, $y = 0^i$, $z = 0^i1^j$\\
\hline
\end{tabular}
\end{center}
A handful of students did exhibit their reasoning that for
all segmentations there would exist at least one value of $i$ that
would generate a string outside the language.
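As a sketch of what such reasoning can look like, consider the language that appears among the example errors, written here as $\mathcal{L} = \{0^m1^n, m \neq n\}$ so that $i$ stays free for the number of repetitions. The following is one standard argument, offered for illustration; it is not claimed to be the solution expected on the exam, and the choice of string $s$ and the symbol $k$ are assumptions introduced here. Let $p$ be the pumping length and choose $s = 0^p1^{p+p!} \in \mathcal{L}$. Any segmentation $s = xyz$ with $|xy| \leq p$ and $|y| \geq 1$ has $y = 0^k$ for some $1 \leq k \leq p$. Since $k$ divides $p!$, the value $i = 1 + p!/k$ is a positive integer, and
\[
xy^iz = 0^{p + (i-1)k}\,1^{p+p!} = 0^{p+p!}\,1^{p+p!} \notin \mathcal{L}.
\]
The value of $i$ that works depends on $k$, that is, on the segmentation, so no single choice of $i$ serves for all segmentations.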
We categorized the errors as misunderstandings of one or
more of:
\begin{enumerate}
\item $|xy| \leq p$ permits $|xy| < p$
\item $x$ is the part of the string prior to the cycle
\item $y$ is the part of the string which returns the state of the automaton to a previously visited state
\item $z$ is the part of the string after the (last) cycle up to acceptance
\item $p - 1$ characters is the maximum size of a string that need not contain a cycle (strings of length $p$ or greater must reuse a state)
\item $i$ is the number of executions of $y$
\item There must be no segmentation for which pumping is possible, if pumping cannot occur.
\item A language is a set of strings.
\item A language class is a set of languages.
\end{enumerate}
Categories are shown in the chapter on results (labelled table iii).
\section{Help Session and Tutoring}
Some students, who did know that any statement can, and must, be
either true or false, thought that implications must be true.