Introduction to Computer Programming
A course on computer programming emphasizing the program design process and pragmatic programming skills. It will use the Python programming language and will not assume previous programming experience. Material covered will include data types, variables, assignment, control structures, functions, scoping, compound data, string processing, modules, basic input/output (terminal and file), as well as more advanced topics such as recursion, exception handling and object-oriented programming. Program development and maintenance skills including debugging, testing, and documentation will also be taught. Assignments will include problems drawn from fields such as graphics, numerics, networking, and games. At the end of the course, students will be ready to learn other programming languages in courses such as CS 11, and will also be ready to take more in-depth courses such as CS 2 and CS 4.
Intermediate Computer Programming
Students must be placed into this course via the CS placement test. An intermediate course on computer programming emphasizing the program design process and pragmatic programming skills. It will use the Java programming language and will assume previous programming experience such as an AP CS A course. Material will focus on more advanced topics such as recursion, exception handling and object-oriented programming. Program development and maintenance skills including debugging, testing, and documentation will also be taught. Assignments will include problems drawn from fields such as graphics, numerics, networking, and games. At the end of the course, students will be ready to learn other programming languages in courses such as CS 11, and will also be ready to take more in-depth courses such as CS 2 and CS 4.
Introduction to Programming Methods
CS 2 is a demanding course in programming languages and computer science. Topics covered include data structures, including lists, trees, and graphs; implementation and performance analysis of fundamental algorithms; and algorithm design principles, in particular recursion and dynamic programming. Heavy emphasis is placed on the use of compiled languages and development tools, including source control and debugging. The course includes weekly laboratory exercises and projects covering the lecture material and program design. The course is intended to establish a foundation for further work in many topics in the computer science option.
Introduction to Software Design
CS 3 is a practical introduction to designing large programs in a low-level language. Heavy emphasis is placed on documentation, testing, and software architecture. Students will work in teams on two 5-week-long projects. In the first half of the course, teams will focus on testing and extensibility. In the second half of the course, teams will use POSIX APIs, as well as their own code from the first five weeks, to develop a large software deliverable. Software engineering topics covered include code reviews, testing and testability, code readability, API design, refactoring, and documentation.
Fundamentals of Computer Programming
This course gives students the conceptual background necessary to construct and analyze programs, which includes specifying computations, understanding evaluation models, and using major programming language constructs (functions and procedures, conditionals, recursion and looping, scoping and environments, compound data, side effects, higher-order functions and functional programming, and object-oriented programming). It emphasizes key issues that arise in programming and in computation in general, including time and space complexity, choice of data representation, and abstraction management. This course is intended for students with some programming background who want a deeper understanding of the conceptual issues involved in computer programming.
Introduction to Discrete Mathematics
First term: a survey emphasizing graph theory, algorithms, and applications of algebraic structures. Graphs: paths, trees, circuits, breadth-first and depth-first searches, colorings, matchings. Enumeration techniques; formal power series; combinatorial interpretations. Topics from coding and cryptography, including Hamming codes and RSA. Second term: directed graphs; networks; combinatorial optimization; linear programming. Permutation groups; counting nonisomorphic structures. Topics from extremal graph and set theory, and partially ordered sets. Third term: elements of computability theory and computational complexity. Discussion of the P=NP problem, syntax and semantics of propositional and first-order logic. Introduction to the Gödel completeness and incompleteness theorems.
Introduction to Computer Science Research
This course will introduce students to research areas in CS through weekly overview talks by Caltech faculty. It is aimed at first-year undergraduates. More senior students may wish to take the course to gain an understanding of the scope of research in computer science. Graded pass/fail.
Introduction to Digital Logic and Embedded Systems
This course is intended to give the student a basic understanding of the major hardware and software principles involved in the specification and design of embedded systems. The course will cover basic digital logic, programmable logic devices, CPU and embedded system architecture, and embedded systems programming principles (interfacing to hardware, events, user interfaces, and multi-tasking).
Computer Language Lab
A self-paced lab that provides students with extra practice and supervision in transferring their programming skills to a particular programming language. The course can be used for any language of the student's choosing, subject to approval by the instructor. A series of exercises guides students through the pragmatic use of the chosen language, building their familiarity, experience, and style. More advanced students may propose their own programming project as the target demonstration of their new language skills. This course is available for undergraduate students only. Graduate students should register for CS 111. CS 11 may be repeated for credit of up to a total of nine units.
Student-Taught Topics in Computing
Each section covers a topic in computing with associated sets or projects. Sections are designed and taught by an undergraduate student under the supervision of a CMS faculty member. CS 12 may be repeated for credit of up to a total of nine units.
Introduction to Computer Science in Industry
This course will introduce students to CS in industry through weekly overview talks by alums and engineers in industry. It is aimed at first- and second-year undergraduates. Others may wish to take the course to gain an understanding of the scope of computer science in industry. Additionally, students will complete short weekly assignments aimed at preparing them for interactions with industry. Graded pass/fail.
Decidability and Tractability
This course introduces the formal foundations of computer science, the fundamental limits of computation, and the limits of efficient computation. Topics will include automata and Turing machines, decidability and undecidability, reductions between computational problems, and the theory of NP-completeness.
Data Structures & Parallelism
CS 22 is a demanding course that covers implementation, correctness, and analysis of data structures and some parallel algorithms. This course is intended for students who have already taken a data structures course at the level of CS 2. Topics include implementation and analysis of skip lists, trees, hashing, and heaps as well as various algorithms (including string matching, parallel sorting, parallel prefix). The course includes weekly written and programming assignments covering the lecture material.
Introduction to Computing Systems
Basic introduction to computer systems, including hardware-software interface, computer architecture, and operating systems. Course emphasizes computer system abstractions and the hardware and software techniques necessary to support them, including virtualization (e.g., memory, processing, communication), dynamic resource management, common-case optimization, isolation, and naming.
This course introduces techniques for the design and analysis of efficient algorithms. Major design techniques (the greedy approach, divide and conquer, dynamic programming, linear programming) will be introduced through a variety of algebraic, graph, and optimization problems. Methods for identifying intractability (via NP-completeness) will be discussed.
Computer Science Education in K-14 Settings
This course will focus on computer science education in K-14 settings. Students will gain an understanding of the current state of computer science education within the United States, develop curricula targeted at students from diverse backgrounds, and gain hands-on teaching experience. Through readings from educational psychology and neuropsychology, students will become familiar with various pedagogical methods and theories of learning, while applying these in practice as part of a teaching group partnered with a local school or community college. Each week students are expected to spend about 2 hours teaching, 2 hours developing curricula, and 2 hours on readings and individual exercises. Pass/Fail only. May not be repeated.
Multidisciplinary Systems Engineering
This course presents the fundamentals of modern multidisciplinary systems engineering in the context of a substantial design project. Students from a variety of disciplines will conceive, design, implement, and operate a system involving electrical, information, and mechanical engineering components. Specific tools will be provided for setting project goals and objectives, managing interfaces between component subsystems, working in design teams, and tracking progress against tasks. Students will be expected to apply knowledge from other courses at Caltech in designing and implementing specific subsystems. During the first two terms of the course, students will attend project meetings and learn some basic tools for project design, while taking courses in CS, EE, and ME that are related to the course project. During the third term, the entire team will build, document, and demonstrate the course design project, which will differ from year to year. First-year undergraduate students must receive permission from the lead instructor to enroll. Not offered 2022-23.
Individual research project, carried out under the supervision of a member of the computer science faculty (or other faculty as approved by the computer science undergraduate option representative). Projects must include significant design effort. Written report required. Open only to upperclass students. Not offered on a pass/fail basis.
Undergraduate Projects in Computer Science
Supervised research or development in computer science by undergraduates. The topic must be approved by the project supervisor, and a formal final report must be presented on completion of research. This course can (with approval) be used to satisfy the project requirement for the CS major. Graded pass/fail.
Undergraduate Reading in Computer Science
Supervised reading in computer science by undergraduates. The topic must be approved by the reading supervisor, and a formal final report must be presented on completion of the term. Graded pass/fail.
Special Topics in Computer Science
The topics covered vary from year to year, depending on the students and staff. Primarily for undergraduates.
Seminar in Computer Science
Instructor's permission required.
Reading in Computer Science
Instructor's permission required.
Causation and Explanation
An examination of theories of causation and explanation in philosophy and neighboring disciplines. Topics discussed may include probabilistic and counterfactual treatments of causation, the role of statistical evidence and experimentation in causal inference, and the deductive-nomological model of explanation. The treatment of these topics by important figures from the history of philosophy such as Aristotle, Descartes, and Hume may also be considered.
Graduate Programming Practicum
A self-paced lab that provides students with extra practice and supervision in transferring their programming skills to a particular programming language. The course can be used for any language of the student's choosing, subject to approval by the instructor. A series of exercises guides the student through the pragmatic use of the chosen language, building his or her familiarity, experience, and style. More advanced students may propose their own programming project as the target demonstration of their new language skills. This course is available for graduate students only. CS 111 may be repeated for credit of up to a total of nine units. Undergraduates should register for CS 11.
This course provides an introduction to Bayesian Statistics and its applications to data analysis in various fields. Topics include: discrete models, regression models, hierarchical models, model comparison, and MCMC methods. The course combines an introduction to basic theory with a hands-on emphasis on learning how to use these methods in practice so that students can apply them in their own work. Previous familiarity with frequentist statistics is useful but not required.
This course is both a theoretical and practical introduction to functional programming, a paradigm which allows programmers to work at an extremely high level of abstraction while simultaneously avoiding large classes of bugs that plague more conventional imperative and object-oriented languages. The course will introduce and use the lazy functional language Haskell exclusively. Topics include: recursion, first-class functions, higher-order functions, algebraic data types, polymorphic types, function composition, point-free style, proving functions correct, lazy evaluation, pattern matching, lexical scoping, type classes, and modules. Some advanced topics such as monad transformers, parser combinators, dynamic typing, and existential types are also covered.
Reasoning about Program Correctness
This course presents the use of logic and formal reasoning to prove the correctness of sequential and concurrent programs. Topics in logic include propositional logic, basics of first-order logic, and the use of logic notations for specifying programs. The course presents a programming notation and its formal semantics, Hoare logic and its use in proving program correctness, predicate transformers and weakest preconditions, and fixed-point theory and its application to proofs of programs. Not offered 2022-23.
Various approaches to computability theory, e.g., Turing machines, recursive functions, Markov algorithms; proof of their equivalence. Church's thesis. Theory of computable functions and effectively enumerable sets. Decision problems. Undecidable problems: word problems for groups, solvability of Diophantine equations (Hilbert's 10th problem). Relations with mathematical logic and the Gödel incompleteness theorems. Decidable problems, from number theory, algebra, combinatorics, and logic. Complexity of decision procedures. Inherently complex problems of exponential and superexponential difficulty. Feasible (polynomial time) computations. Polynomial deterministic vs. nondeterministic algorithms, NP-complete problems and the P = NP question. Not offered 2022-23.
Automata-Theoretic Software Analysis
An introduction to the use of automata theory in the formal analysis of both concurrent and sequentially executing software systems. The course covers the use of logic model checking with linear temporal logic and interactive techniques for property-based static source code analysis. Not offered 2022-23.
Advanced Digital Systems Design
Advanced digital design as it applies to the design of systems using PLDs and ASICs (in particular, gate arrays and standard cells). The course covers both design and implementation details of various systems and logic device technologies. The emphasis is on the practical aspects of ASIC design, such as timing, testing, and fault grading. Topics include synchronous design, state machine design, ALU and CPU design, application-specific parallel computer design, design for testability, CPLDs, FPGAs, VHDL, standard cells, timing analysis, fault vectors, and fault grading. Students are expected to design and implement both systems discussed in the class as well as self-proposed systems using a variety of technologies and tools. Given in alternate years; offered 2022-23.
This course is an introduction to quantum cryptography: how to use quantum effects, such as quantum entanglement and uncertainty, to implement cryptographic tasks with levels of security that are impossible to achieve classically. The course covers the fundamental ideas of quantum information that form the basis for quantum cryptography, such as entanglement and quantifying quantum knowledge. We will introduce the security definition for quantum key distribution and see protocols and proofs of security for this task. We will also discuss the basics of device-independent quantum cryptography as well as other cryptographic tasks and protocols, such as bit commitment or position-based cryptography. Not offered 2022-23.
Introduction to the basic theory and usage of relational database systems. It covers the relational data model, relational algebra, and the Structured Query Language (SQL). The course introduces the basics of database schema design and covers the entity-relationship model, functional dependency analysis, and normal forms. Additional topics include other query languages based on the relational calculi, data-warehousing and dimensional analysis, writing and using stored procedures, working with hierarchies and graphs within relational databases, and an overview of transaction processing and query evaluation. Extensive hands-on work with SQL databases.
Database System Implementation
This course explores the theory, algorithms, and approaches behind modern relational database systems. Topics include file storage formats, query planning and optimization, query evaluation, indexes, transaction processing, concurrency control, and recovery. Assignments consist of a series of programming projects extending a working relational database, giving hands-on experience with the topics covered in class. The course also has a strong focus on proper software engineering practices, including version control, testing, and documentation. Not offered 2022-23.
Projects in Database Systems
Students are expected to execute a substantial project in databases, write up a report describing their work, and make a presentation. Not offered 2022-23.
This course explores the major themes and components of modern operating systems, such as kernel architectures, the process abstraction and process scheduling, system calls, concurrency within the OS, virtual memory management, and file systems. Students must work in groups to complete a series of challenging programming projects, implementing major components of an instructional operating system. Most programming is in C, although some IA32 assembly language programming is also necessary. Familiarity with the material in CS 24 is strongly advised before attempting this course.
Digital Circuit Design with FPGAs and VHDL
Study of programmable logic devices (FPGAs). Detailed study of the VHDL language, accompanied by tutorials of popular synthesis and simulation tools. Review of combinational circuits (both logic and arithmetic), followed by VHDL code for combinational circuits and corresponding FPGA-implemented designs. Review of sequential circuits, followed by VHDL code for sequential circuits and corresponding FPGA-implemented designs. Review of finite state machines, followed by VHDL code for state machines and corresponding FPGA-implemented designs. Final project. The course includes a wide selection of real-world projects, implemented and tested using FPGA boards.
Shannon's mathematical theory of communication, 1948-present. Entropy, relative entropy, and mutual information for discrete and continuous random variables. Shannon's source and channel coding theorems. Mathematical models for information sources and communication channels, including memoryless, Markov, ergodic, and Gaussian. Calculation of capacity and rate-distortion functions. Universal source codes. Side information in source coding and communications. Network information theory, including multiuser data compression, multiple access channels, broadcast channels, and multiterminal networks. Discussion of philosophical and practical implications of the theory. This course, when combined with EE 112, EE/Ma/CS/IDS 127, EE/CS 161, and EE/CS/IDS 167, should prepare the student for research in information theory, coding theory, wireless communications, and/or data compression. EE/Ma/CS 126 a offered 2022-23; EE/Ma/CS 126 b not offered 2022-23.
This course develops from first principles the theory and practical implementation of the most important techniques for combating errors in digital transmission or storage systems. Topics include highly symmetric linear codes, such as Hamming, Reed-Muller, and Polar codes; algebraic block codes, e.g., BCH, Reed-Solomon (including a self-contained introduction to the theory of finite fields); and sparse graph codes with iterative decoding, i.e., LDPC codes and turbo codes. Students will become acquainted with encoding and decoding algorithms, design principles, and performance evaluation of codes. Not offered 2022-23.
Interactive Theorem Proving
This course introduces students to the modern practice of interactive tactic-based theorem proving using the Coq theorem prover. Topics will be drawn from logic, programming languages and the theory of computation. Topics will include: proof by induction, lists, higher-order functions, polymorphism, dependently-typed functional programming, constructive logic, the Curry-Howard correspondence, modeling imperative programs, and other topics if time permits. Students will be graded partially on attendance and will be expected to participate in proving theorems in class.
This course covers the foundations of experimental realization on robotic systems. This includes software infrastructure to operate physical hardware, integrate various sensor modalities, and create robust autonomous behaviors. Using the Python programming language, assignments will explore techniques from simple polling to interrupt-driven and multi-threaded architectures, ultimately utilizing the Robot Operating System (ROS). Developments will be integrated on mobile robotic systems and demonstrated in the context of class projects.
This course presents a survey of software engineering principles relevant to all aspects of the software development lifecycle. Students will examine industry best practices in the areas of software specification, development, project management, testing, and release management, including a review of the relevant research literature. Assignments give students the opportunity to explore these topics in depth. Programming assignments use Python and Git, and students should be familiar with Python at a CS 1 level, and Git at a CS 2/CS 3 level, before taking the course.
CS 131 is a course on programming languages and their implementation. It teaches students how to program in a number of simplified languages representing the major programming paradigms in use today (imperative, object-oriented, and functional). It will also teach students how to build and modify the implementations of these languages. Emphasis will not be on syntax or parsing but on the essential differences in these languages and their implementations. Both dynamically-typed and statically-typed languages will be implemented. Relevant theory will be covered as needed. Implementations will mostly be interpreters, but some features of compilers will be covered if time permits. Enrollment limited to 30 students.
Covers full-stack web development with HTML5, CSS, client-side JS (ES6) and server-side JS (Node.js/Express) for web API development. Concepts including separation of concerns, the client-server relationship, user experience, accessibility, and security will also be emphasized throughout the course. Assignments will alternate between formal and semi-structured student-driven projects, providing students various opportunities to apply material to their individual interests. No prior web development background is required, though students who have prior experience may still benefit from learning best practices and HTML5/ES6 standards.
The course develops the core concepts of robotics. The first quarter focuses on classical robotic manipulation, including topics in rigid body kinematics and dynamics. It develops planar and 3D kinematic formulations and algorithms for forward and inverse computations, Jacobians, and manipulability. The second quarter transitions to planning, navigation, and perception. Topics include configuration space, sample-based planners, and the A* and D* algorithms for achieving collision-free motions. Course work transitions from homework and programming assignments to more open-ended team-based projects.
This course builds up, and brings to practice, the elements of robotic systems at the intersection of hardware, kinematics and control, computer vision, and autonomous behaviors. It presents selected topics from these domains, focusing on their integration into a full sense-think-act robot. The lectures will drive team-based projects, progressing from building custom robotic arms (5 to 7 degrees of freedom) to writing all necessary software (utilizing the Robot Operating System, ROS). Teams are required to implement and customize general concepts for their selected tasks. Working systems will autonomously operate and demonstrate their capabilities during final presentations.
Power System Analysis
We are at the beginning of a historic transformation to decarbonize our energy system. This course introduces the basics of power systems analysis: phasor representation, 3-phase transmission system, transmission line models, transformer models, per-unit analysis, network matrix, power flow equations, power flow algorithms, optimal power flow (OPF) problems, unbalanced power flow analysis and optimization, swing dynamics and stability.
Information Theory and Applications
This class introduces information measures such as entropy, information divergence, mutual information, information density, and discusses the relations of those quantities to problems in data compression and transmission, statistical inference, and control. The course does not require a prior exposure to information theory; it is complementary to EE 126 a.
Real-World Algorithm Implementation
This course introduces algorithms in the context of their usage in the real world. The course covers compression, semi-numerical algorithms, RSA cryptography, parsing, and string matching. The goal of the course is for students to see how to use theoretical algorithms in real-world contexts, focusing both on correctness and the nitty-gritty details and optimizations. Students will choose to implement projects based on depth in an area or breadth to cover all the topics.
This course is identical to CS 38. Only graduate students for whom this is the first algorithms course are allowed to register for CS 138. See the CS 38 entry for prerequisites and course description.
Analysis and Design of Algorithms
This course develops core principles for the analysis and design of algorithms. Basic material includes mathematical techniques for analyzing performance in terms of resources, such as time, space, and randomness. The course introduces the major paradigms for algorithm design, including greedy methods, divide-and-conquer, dynamic programming, linear and semidefinite programming, randomized algorithms, and online learning. Not offered 2022-23.
Hack Society: Projects from the Public Sector
There is a large gap between the public and private sectors' effective use of technology. This gap presents an opportunity for the development of innovative solutions to problems faced by society. Students will develop technology-based projects that address this gap. Course material will offer an introduction to the design, development, and analysis of digital technology with examples derived from services typically found in the public sector. Not offered 2022-23.
Programming distributed systems. Mechanics for cooperation among concurrent agents. Programming sensor networks and cloud computing applications. Applications of machine learning and statistics by using parallel computers to aggregate and analyze data streams from sensors.
This course focuses on the link layer (two) through the transport layer (four) of Internet protocols. It has two distinct components, analytical and systems. In the analytical part, after a quick summary of basic mechanisms on the Internet, we will focus on congestion control and explain: (1) How to model congestion control algorithms? (2) Is the model well defined? (3) How to characterize the equilibrium points of the model? (4) How to prove the stability of the equilibrium points? We will study basic results in ordinary differential equations, convex optimization, Lyapunov stability theorems, passivity theorems, gradient descent, contraction mapping, and Nyquist stability theory. We will apply these results to prove equilibrium and stability properties of the congestion control models and explore their practical implications. In the systems part, the students will build a software simulator of Internet routing and congestion control algorithms. The goal is not only to expose students to basic analytical tools that are applicable beyond congestion control, but also to demonstrate in depth the entire process of understanding a physical system, building mathematical models of the system, analyzing the models, exploring the practical implications of the analysis, and using the insights to improve the design. Not offered 2022-23.
Networks: Structure & Economics
Social networks, the web, and the internet are essential parts of our lives, and we depend on them every day. This course studies how they work and the "big" ideas behind our networked lives. Questions explored include: What do networks actually look like (and why do they all look the same)?; How do search engines work?; Why do memes spread the way they do?; How does web advertising work? For all these questions and more, the course will provide a mixture of both mathematical analysis and hands-on labs. The course expects students to be comfortable with graph theory, probability, and basic programming.
Projects in Networking
Students are expected to execute a substantial project in networking, write up a report describing their work, and make a presentation.
Control and Optimization of Networks
This is a research-oriented course meant for undergraduates and beginning graduate students who want to learn about current research topics in networks such as the Internet, power networks, social networks, etc. The topics covered in the course will vary, but will be pulled from current research in the design, analysis, control, and optimization of networks. Usually offered in odd years.
Digital Ventures Design
This course aims to offer the scientific foundations of analysis, design, development, and launching of innovative digital products and study elements of their success and failure. The course provides students with an opportunity to experience combined team-based design, engineering, and entrepreneurship. The lectures present a disciplined step-by-step approach to develop new ventures based on technological innovation in this space, and with invited speakers, cover topics such as market analysis, user/product interaction and design, core competency and competitive position, customer acquisition, business model design, unit economics and viability, and product planning. Throughout the term students will work within an interdisciplinary team of their peers to conceive an innovative digital product concept and produce a business plan and a working prototype. The course project culminates in a public presentation and a final report. Every year the course and projects focus on a particular emerging technology theme. Not offered 2022-23.
Selected Topics in Computational Vision
The class will focus on an advanced topic in computational vision, such as recognition, vision-based navigation, or 3-D reconstruction. The class will include a tutorial introduction to the topic, an exploration of relevant recent literature, and a project involving the design, implementation, and testing of a vision system.
This course will equip students to engage with active research at the intersection of social and information sciences, including: algorithmic game theory and mechanism design; auctions; matching markets; and learning in games.
Probability and Algorithms
Part a: The probabilistic method and randomized algorithms. Deviation bounds, k-wise independence, graph problems, identity testing, derandomization and parallelization, metric space embeddings, local lemma. Part b: Further topics such as weighted sampling, epsilon-biased sample spaces, advanced deviation inequalities, rapidly mixing Markov chains, analysis of boolean functions, expander graphs, and other gems in the design and analysis of probabilistic algorithms. Parts a & b are given in alternate years.
This course describes a diverse array of complexity classes that are used to classify problems according to the computational resources (such as time, space, randomness, or parallelism) required for their solution. The course examines problems whose fundamental nature is exposed by this framework, the known relationships between complexity classes, and the numerous open problems in the area.
Introduction to Cryptography
This course is an introduction to the foundations of cryptography. The first part of the course introduces fundamental constructions in private-key cryptography, including one-way functions, pseudo-random generators and authentication, and in public-key cryptography, including trapdoor one-way functions, collision-resistant hash functions and digital signatures. The second part of the course covers selected topics such as interactive protocols and zero knowledge, the learning with errors problem and homomorphic encryption, and quantum cryptography: quantum money, quantum key distribution. The course is mostly theoretical and requires mathematical maturity. There will be a small programming component. Not offered 2022-23.
Current Topics in Theoretical Computer Science
May be repeated for credit, with permission of the instructor. Students in this course will study an area of current interest in theoretical computer science. The lectures will cover relevant background material at an advanced level and present results from selected recent papers within that year's chosen theme. Students will be expected to read and present a research paper.
Machine Learning & Data Mining
This course will cover popular methods in machine learning and data mining, with an emphasis on developing a working understanding of how to apply these methods in practice. The course will focus on basic foundational concepts underpinning and motivating modern machine learning and data mining approaches. We will also discuss recent research developments.
Introduction to the theory, algorithms, and applications of automated learning. How much information is needed to learn a task, how much computation is involved, and how it can be accomplished. Special emphasis will be given to unifying the different approaches to the subject coming from statistics, function approximation, optimization, pattern recognition, and neural networks.
Statistical inference is a branch of mathematical engineering that studies ways of extracting reliable information from limited data for learning, prediction, and decision making in the presence of uncertainty. This is an introductory course on statistical inference. The main goals are to develop statistical thinking and an intuitive feel for the subject; to introduce the most fundamental ideas, concepts, and methods of statistical inference; and to explain how and why they work, and when they don't. Topics covered include summarizing data, fundamentals of survey sampling, statistical functionals, jackknife, bootstrap, methods of moments and maximum likelihood, hypothesis testing, p-values, the Wald, Student's t-, permutation, and likelihood ratio tests, multiple testing, scatterplots, simple linear regression, ordinary least squares, interval estimation, prediction, and graphical residual analysis.
Fundamentals of Statistical Learning
The main goal of the course is to provide an introduction to the central concepts and core methods of statistical learning, an interdisciplinary field at the intersection of statistics, machine learning, information and data sciences. The course focuses on the mathematics and statistics of methods developed for learning from data. Students will learn what methods for statistical learning exist, how and why they work (not just what tasks they solve and in what built-in functions they are implemented), and when they are expected to perform poorly. The course is oriented toward upper-level undergraduate students in IDS, ACM, and CS and graduate students from other disciplines who have sufficient background in probability and statistics. The course can be viewed as a statistical analog of CMS/CS/CNS/EE/IDS 155. Topics covered include supervised and unsupervised learning, regression and classification problems, linear regression, subset selection, shrinkage methods, logistic regression, linear discriminant analysis, resampling techniques, tree-based methods, support-vector machines, and clustering methods. Not offered 2022-23.
Advanced Topics in Machine Learning
This course focuses on current topics in machine learning research. This is a paper reading course, and students are expected to understand material directly from research articles. Students are also expected to present in class, and to do a final project.
Fundamentals of Information Transmission and Storage
Basics of information theory: entropy, mutual information, source and channel coding theorems. Basics of coding theory: error-correcting codes for information transmission and storage, block codes, algebraic codes, sparse graph codes. Basics of digital communications: sampling, quantization, digital modulation, matched filters, equalization.
Big Data Networks
Next generation networks will have tens of billions of nodes forming cyber-physical systems and the Internet of Things. A number of fundamental scientific and technological challenges must be overcome to deliver on this vision. This course will focus on (1) How to boost efficiency and reliability in large networks; the role of network coding, distributed storage, and distributed caching; (2) How to manage wireless access on a massive scale; modern random access and topology formation techniques; and (3) New vistas in big data networks, including distributed computing over networks and crowdsourcing. A selected subset of these problems, their mathematical underpinnings, state-of-the-art solutions, and challenges ahead will be covered. Given in alternate years. Not offered 2022-23.
Data, Algorithms and Society
This course examines algorithms and data practices in fields such as machine learning, privacy, and communication networks through a social lens. We will draw upon theory and practices from art, media, computer science and technology studies to critically analyze algorithms and their implementations within society. The course includes projects, lectures, readings, and discussions. Students will learn mathematical formalisms, critical thinking and creative problem solving to connect algorithms to their practical implementations within social, cultural, economic, legal and political contexts. Enrollment by application. Taught concurrently with VC 72 and can only be taken once as CS/IDS 162 or VC 72.
This course covers the construction of compilers: programs that convert program source code to machine code that is directly executable on modern hardware. The course takes a bottom-up approach: a series of compilers will be built, all of which generate assembly code for x86 processors, with each compiler adding features. The final compiler will compile a full-fledged high-level programming language to assembly language. Topics covered include register allocation, conditionals, loops and dataflow analysis, garbage collection, lexical scoping, and type checking. This course is programming intensive. All compilers will be written in the OCaml programming language.
Foundations of Machine Learning and Statistical Inference
The course assumes students are comfortable with analysis, probability, statistics, and basic programming. This course will cover core concepts in machine learning and statistical inference. The ML concepts covered are spectral methods (matrices and tensors), non-convex optimization, probabilistic models, neural networks, representation theory, and generalization. In statistical inference, the topics covered are detection and estimation, sufficient statistics, Cramer-Rao bounds, Rao-Blackwell theory, variational inference, and multiple testing. In addition to covering the core concepts, the course encourages students to ask critical questions such as: How relevant is theory in the age of deep learning? What are the outstanding open problems? Assignments will include exploring failure modes of popular algorithms, in addition to traditional problem-solving type questions.
Computational cameras overcome the limitations of traditional cameras by moving part of the image formation process from hardware to software. In this course, we will study this emerging multi-disciplinary field at the intersection of signal processing, applied optics, computer graphics, and vision. At the start of the course, we will study modern image processing and image editing pipelines, including those encountered on DSLR cameras and mobile phones. Then we will study the physical and computational aspects of tasks such as coded photography, light-field imaging, astronomical imaging, medical imaging, and time-of-flight cameras. The course has a strong hands-on component, in the form of homework assignments and a final project. In the homework assignments, students will have the opportunity to implement many of the techniques covered in the class. Example homework assignments include building an end-to-end HDR (High Dynamic Range) imaging pipeline, implementing Poisson image editing, refocusing a light-field image, and making your own lensless "scotch-tape" camera. Not offered 2022-23.
Introduction to Data Compression and Storage
The course will introduce students to the basic principles and techniques of codes for data compression and storage. Students will master the basic algorithms used for lossless and lossy compression of digital and analog data and the major ideas behind coding for flash memories. Topics include the Huffman code, the arithmetic code, Lempel-Ziv dictionary techniques, scalar and vector quantizers, transform coding, and codes for constrained storage systems. Given in alternate years; not offered 2022-23.
Mobile robots need to perceive their environment and localize themselves with respect to maps of it. They further require planners to move along collision-free paths. This course builds up mobile robots in team-based projects. Teams will write all necessary software, from low-level hardware I/O to high-level algorithms, using the Robot Operating System (ROS). The final systems will autonomously maneuver to reach their goals or track various objectives.
Computer Graphics Laboratory
This is a challenging course that introduces the basic ideas behind computer graphics and some of its fundamental algorithms. Topics include graphics input and output, the graphics pipeline, sampling and image manipulation, three-dimensional transformations and interactive modeling, basics of physically based modeling and animation, simple shading models and their hardware implementation, and some of the fundamental algorithms of scientific visualization. Students will be required to perform significant implementations.
Computer Graphics Projects
This laboratory class offers students an opportunity for independent work including recent computer graphics research. In coordination with the instructor, students select a computer graphics modeling, rendering, interaction, or related algorithm and implement it. Students are required to present their work in class and discuss the results of their implementation and possible improvements to the basic methods. May be repeated for credit with instructor's permission. Not offered 2022-23.
Advanced Topics in Digital Design with FPGAs and VHDL
Quick review of the VHDL language and RTL concepts. Dealing with sophisticated, multi-dimensional data types in VHDL. Dealing with multiple time domains. Transfer of control versus data between clock domains. Clock division and multiplication. Using PLLs. Dealing with global versus local and synchronous versus asynchronous resets. How to measure maximum speed in FPGAs (for both registered and unregistered circuits). The (often) hard task of timing closure. The subtleties of timing behavior in state machines (a major source of errors in large, complex designs). Introduction to simulation. Construction of VHDL testbenches for automated testing. Dealing with files in simulation. All designs are physically implemented using FPGA boards.
Computer Graphics Research
The course will go over recent research results in computer graphics, covering subjects from mesh processing (acquisition, compression, smoothing, parameterization, adaptive meshing), simulation for purposes of animation, rendering (both photo- and nonphotorealistic), geometric modeling primitives (image based, point based), and motion capture and editing. Other subjects may be treated as they appear in the recent literature. The goal of the course is to bring students up to the frontiers of computer graphics research and prepare them for their own research. Not offered 2022-23.
Discrete Differential Geometry: Theory and Applications
Working knowledge of multivariate calculus and linear algebra as well as fluency in some implementation language is expected. Subject matter covered: differential geometry of curves and surfaces, classical exterior calculus, discrete exterior calculus, sampling and reconstruction of differential forms, low dimensional algebraic and computational topology, Morse theory, Noether's theorem, Helmholtz-Hodge decomposition, structure preserving time integration, connections and their curvatures on complex line bundles. Applications include elastica and rods, surface parameterization, conformal surface deformations, computation of geodesics, tangent vector field design, connections, discrete thin shells, fluids, electromagnetism, and elasticity. Part b not offered 2022-23.
Numerical Algorithms and their Implementation
This course gives students the understanding necessary to choose and implement basic numerical algorithms as needed in everyday programming practice. Concepts include sources of numerical error, stability, convergence, ill-conditioning, and efficiency. Algorithms covered include solution of linear systems (direct and iterative methods), orthogonalization, SVD, interpolation and approximation, numerical integration, solution of ODEs and PDEs, transform methods (Fourier, wavelet), and low-rank approximation such as multipole expansions. Not offered 2022-23.
Some experience with computer graphics algorithms preferred. The use of graphics processing units (GPUs) for computer graphics rendering is well known, but their power for general parallel computation has only recently begun to be explored. Parallel algorithms running on GPUs can often achieve up to 100x speedup over similar CPU algorithms. This course covers programming techniques for the GPU, focusing on visualization and simulation of various systems. Labs will cover specific applications in graphics, mechanics, and signal processing. The course will use NVIDIA's parallel computing architecture, CUDA. Labwork requires extensive programming.
Master’s Thesis Research
Introduction to Computational Biology and Bioinformatics
Biology is becoming an increasingly data-intensive science. Many of the data challenges in the biological sciences are distinct from those in other scientific disciplines because of the complexity involved. This course will introduce key computational, probabilistic, and statistical methods that are common in computational biology and bioinformatics. We will integrate these theoretical aspects to discuss solutions to common challenges that recur throughout bioinformatics, including algorithms and heuristics for tackling DNA sequence alignments, phylogenetic reconstructions, evolutionary analysis, and population and human genetics. We will discuss these topics in conjunction with common applications, including the analysis of high-throughput DNA sequencing data sets and analysis of gene expression from RNA-Seq data sets.
Vision: From Computational Theory to Neuronal Mechanisms
Lecture, laboratory, and project course aimed at understanding visual information processing, in both machines and the mammalian visual system. The course will emphasize an interdisciplinary approach aimed at understanding vision at several levels: computational theory, algorithms, psychophysics, and hardware (i.e., neuroanatomy and neurophysiology of the mammalian visual system). The course will focus on early vision processes, in particular motion analysis, binocular stereo, brightness, color and texture analysis, visual attention and boundary detection. Students will be required to hand in approximately three homework assignments as well as complete one project integrating aspects of mathematical analysis, modeling, physiology, psychophysics, and engineering. Given in alternate years; not offered 2022-23.
This course aims at a quantitative understanding of how the nervous system computes. The goal is to link phenomena across scales from membrane proteins to cells, circuits, brain systems, and behavior. We will learn how to formulate these connections in terms of mathematical models, how to test these models experimentally, and how to interpret experimental data quantitatively. The concepts will be developed with motivation from some of the fascinating phenomena of animal behavior, such as: aerobatic control of insect flight, precise localization of sounds, sensing of single photons, reliable navigation and homing, rapid decision-making during escape, one-shot learning, and large-capacity recognition memory.
This course investigates computation by molecular systems, emphasizing models of computation based on the underlying physics, chemistry, and organization of biological cells. We will explore programmability, complexity, simulation of, and reasoning about abstract models of chemical reaction networks, molecular folding, molecular self-assembly, and molecular motors, with an emphasis on universal architectures for computation, control, and construction within molecular systems. If time permits, we will also discuss biological example systems such as signal transduction, genetic regulatory networks, and the cytoskeleton; physical limits of computation, reversibility, reliability, and the role of noise; and DNA-based computers and DNA nanotechnology. Part a develops fundamental results; part b is a reading and research course in which classic and current papers will be discussed and students will do projects on current research topics.
Design and Construction of Programmable Molecular Systems
This course will introduce students to the conceptual frameworks and tools of computer science as applied to molecular engineering, as well as to the practical realities of synthesizing and testing their designs in the laboratory. In part a, students will design and construct DNA circuits and self-assembled DNA nanostructures, as well as quantitatively analyze the designs and the experimental data. Students will learn laboratory techniques including fluorescence spectroscopy and atomic force microscopy and will use software tools and program in Mathematica. Part b is an open-ended design and build project requiring instructor’s permission for enrollment. Enrollment in part a is limited to 24 students, and part b limited to 8 students.
The theory of quantum information and quantum computation. Overview of classical information theory, compression of quantum information, transmission of quantum information through noisy channels, quantum error-correcting codes, quantum cryptography and teleportation. Overview of classical complexity theory, quantum complexity, efficient quantum algorithms, fault-tolerant quantum computation, physical implementations of quantum computation.
Topics in Computer Graphics
Each term will focus on some topic in computer graphics, such as geometric modeling, rendering, animation, human-computer interaction, or mathematical foundations. The topics will vary from year to year. May be repeated for credit with instructor's permission. Not offered 2022-23.
Research in Computer Science
Approval of student's research adviser and option adviser must be obtained before registering.
Reading in Computer Science
Instructor's permission required.
Seminar in Computer Science
Instructor's permission required.
Center for the Mathematics of Information Seminar
Instructor's permission required.