last update: 21 August 2002
| CSC 320
Advanced File Structures |
4 cr. |
This course will elaborate on basic data structures and access algorithms as
they pertain to data stored on disk, and will present additional algorithms
pertinent to the problem of rapid data storage/retrieval when information is stored
on a persistent-storage device. Attention will be paid to implementation considerations
and space/time tradeoffs, and to the choice of appropriate structures for the
solution of typical problems. Three lecture hours and three hours of scheduled
laboratory per week, plus extensive laboratory work outside of class.
Prerequisites: CSC 260 with
grade of C+ or higher.
Goals:
The purpose of this course is to develop students' understanding of
data organization and retrieval mechanisms in the context of non-memory-based storage media.
The goals of this course are:
- CG1: to develop an understanding of the physical factors governing non-memory-based
storage and retrieval of data;
- CG2: to present a catalog of the most common file-based data structures and their
standard implementations;
- CG3: to develop the skills and knowledge necessary to critically analyze, select and
implement file structures.
Upon completion of the course, a student should be able to select an appropriate structure
to match the requirements of a given problem, implement the structure if necessary, and
use it in an appropriate way to solve the problem.
Objectives:
Upon successful completion of the course, the student will have:
- CO1: demonstrated knowledge of the fundamental concepts of file input and output;
- CO2: demonstrated knowledge of classic data organization structures and
retrieval and optimization techniques;
- CO3: implemented structures "from scratch" and utilized implemented structures via libraries;
- CO4: performed order-of-magnitude analysis on structures and retrieval algorithms for
evaluation and selection;
- CO5: developed and implemented a plan for unit testing and integration;
- CO6: utilized the UML modeling language and a CASE tool to enhance development,
implementation and documentation;
- CO7: participated in and contributed to group projects involving problem analysis,
design specification and selection, implementation, integration, and validation;
- CO8: produced documentation, including formal class diagrams and rigorous test set
specification and results;
- CO9: presented the results of a project in class.
Topics:
- review of proper program structure utilizing the design techniques of object-oriented
design, modularization and stepwise refinement;
- general review of programming language features
- detailed review of sequential I/O
- table processing
- creation of simple sequential files
- physical file structures
- blocking
- buffering
- paging
- address calculation
- data validation considerations
- generation of test data for specific condition testing and for volume tests
- sequential file updating
- review of algorithm analysis/big-O
- non-sequential file types
- indexing techniques
- hashing techniques
- tree structures
- binary trees
- b-trees
- b-tree variants
- inter- and intra-file organizations
- linked lists
- chaining
- pointers
- solution design criteria
- file life cycles chronolgy of
- creating
- using
- archiving/deleting files
- database systems as file managers
- discussion of database query implementation
- relational database systems
- object-relational systems
The emphasis of the course is on the various algorithms
that allow rapid data access and the techniques necessary to produce efficient
implementations. Considerable time will be spent discussing how to evaluate
performance, both in a theoretical format and by using test executions of the
program, and how to determine the appropriate structure(s) for a particular problem.
Extensive laboratory work, group discussion time and group projects conducted as part
of the scheduled laboratory sessions are an integral component of the course, serving
to reinforce the concepts and techniques presented in lecture and to give students
experience with group software development projects. There will be four to seven
programming projects. Knowledge of problem analysis, structured design techniques,
basic programming techniques and data structures is assumed (see prerequisite).
All programs must conform to departmental guidelines for
program design and implementation, and all lab reports must conform to guidelines
announced in class. Regardless of numeric average, a student will not be eligible
for a passing grade unless he or she has submitted a lab report for every programming
assignment.
The course grade will be determined using the following
approximate weights: programming projects 40%; midterm 20%; final exam
25%; homework 15%.
Bibliography:
- Cooper. Design of Library Automation Systems: File Structures,
Data Structures, and Tools. John Wiley, 1996.
- Folk & Zoellick. File Structures: An Object-Oriented Approach.
Addison-Wesley, 1998. ISBN: 0-201-87401-6
- Garcia-Molina, Hector; Ullman, Jeffrey; Widom, Jennifer. Database
System Implementation. Prentice-Hall, 2000. ISBN: 0-13-040264-8
- Giampaolo. Practical File System Design with the Be File System.
Morgan Kaufman, 1999.
- Wiederhold, Gio. File Organization for Database Design.
McGraw-Hill, 1987.
Salem State Home •
Computer Science Home •
Faculty •
Computer Studies Major •
Flow Sheet
Computer Studies Minor •
Courses •
Course Sequence Diagram •
Computer Laboratories
|