Enforced modularity on a single machine via virtualization.
Virtual memory, bounded buffers, threads.
Saw monolithic vs. microkernels.
Talked about VMs as a means to run multiple instances of an OS on a single machine with enforced modularity (bug in one OS won't crash the others).
Big thing to solve was how to implement the VMM. Solution: Trap and emulate. How the emulation works depends on the situation.
Another key problem: How to trap instructions that don't generate interrupts.
What's left? Performance.
Performance requirements significantly influence a system's design.
Today: General techniques for improving performance.
Technique 1: Buy New Hardware
Why? Moore's Law => processing power doubles every ~1.5 years, DRAM density increases over time, disk price (per GB) decreases, ...
But:
Not all aspects improve at the same pace.
Moore's Law is plateauing.
Hardware improvements don't always keep pace with load increases.
Conclusion: Need to design for performance, potentially re-design as load increases.
General Approach
Measure the system and find the bottleneck (the portion that limits performance).
Relax (improve) the bottleneck.
Measurement
To measure, need metrics:
Throughput: Number of requests over a unit of time.
Latency: Amount of time for a single request.
How the two relate changes depending on the context.
As a system becomes heavily loaded:
Latency and throughput start low. Throughput increases as users enter; latency stays flat...
...until the system is at maximum throughput. Then throughput plateaus and latency increases.
For heavily-loaded systems: Focus on improving throughput.
Need to compare measured throughput to possible throughput: Utilization.
Utilization sometimes makes the bottleneck obvious (CPU is 100% utilized vs. disk is 20% utilized), sometimes not (CPU and disk are each 50% utilized, busy at alternating times).
Helpful to have a model in place: What do we expect from each component?
When bottleneck is not obvious, use measurements to locate candidates for bottlenecks, fix them, see what happens (iterate).
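A minimal sketch of that comparison, assuming we have measured each component's served load and know its maximum capacity from specs or microbenchmarks (all numbers here are hypothetical):

```python
# Hypothetical measurements: requests/sec each component actually served,
# and the maximum it could serve (from specs or microbenchmarks).
measured = {"cpu": 5000, "disk": 120, "network": 800}
capacity = {"cpu": 10000, "disk": 125, "network": 10000}

for component, served in measured.items():
    utilization = served / capacity[component]
    flag = "  <-- bottleneck candidate" if utilization > 0.9 else ""
    print(f"{component}: {utilization:.0%} utilized{flag}")
```

Here the disk (96% utilized) is the candidate to relax first.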
How to Relax the Bottleneck
Better algorithms, etc. These are application-specific. 6.033 focuses on generally-applicable techniques.
Batching, caching, concurrency, scheduling.
Examples of these techniques follow. The examples relate to operating systems (that's what you know), but the techniques apply to all systems.
Disk Throughput
How does an HDD (magnetic disk) work?
Several platters on a rotating axle.
Platters have circular tracks on either side, divided into sectors.
Cylinder: Group of aligned tracks.
Disk arm has one head for each surface, all move together.
Each disk head reads/writes sectors as they rotate past. Size of a sector = unit of read/write operation (typically 512B).
To read/write:
Seek arm to desired track.
Wait for platter to rotate the desired sector under the head.
Read/write as the platter rotates.
What about SSDs?
Organized into cells, each of which holds one (or 2, or 3) bits.
Cells organized into pages; pages into blocks.
Reads happen at page level. Writes also happen at page level, but only to erased pages (no in-place overwrites of pages).
Erases (and thus overwrites) are at block-level.
Takes a high voltage to erase.
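A toy model of that asymmetry (sizes and names made up for illustration), showing why an overwrite can't happen in place:

```python
# Toy flash block: a page can be written only while erased (None);
# changing an already-written page means erasing the whole block first.
PAGES_PER_BLOCK = 4

class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK   # None = erased

    def read(self, page):
        return self.pages[page]                 # page-level read

    def write(self, page, data):
        if self.pages[page] is not None:
            raise ValueError("no in-place overwrite; erase the block first")
        self.pages[page] = data                 # page-level write

    def erase(self):                            # block-level, high voltage
        self.pages = [None] * PAGES_PER_BLOCK

block = FlashBlock()
block.write(0, b"v1")
# block.write(0, b"v2")   # would raise: page already written
block.erase()              # wipes all four pages, not just page 0
block.write(0, b"v2")      # now the rewrite succeeds
```

This is why real SSDs redirect overwrites to fresh pages and garbage-collect stale blocks later.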
How long does R/W take on HDD?
Example disk specs:
Capacity: 400GB
Platters: 5
# heads: 10
# sectors per track: 567–1170 (inner to outer)
# bytes per sector: 512
Rotational speed: 7200 RPM => 8.3ms per revolution
Avg. seek time: 8.2ms (single-track seek: 0.8ms)
So reading a random 4KB block: 8.2ms (avg seek) + 4.1ms (avg rotational delay, half a revolution) + ~0.1ms (transfer) = 12.4ms.
4096B / 12.4ms ≈ 322KB/s => 99% of the time is spent moving the arm and waiting for the platter, not transferring data.
Can we do better?
Use flash? For this particular pattern of random reads, yes; an SSD would help if available.
Batch individual transfers?
0.8ms to seek to the next track + 8.3ms to read the entire track = 9.1ms.
0.8ms is the single-track seek time for our disk (again, from the specs).
1 track contains ~1000 sectors * 512B = 512KB.
Throughput: 512KB / 9.1ms ≈ 55MB/s.
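A quick sketch re-deriving both numbers from the example specs above:

```python
# Random 4KB read: avg seek + half a revolution + transfer time.
avg_seek_ms = 8.2
half_rev_ms = 8.3 / 2          # avg rotational delay = half a revolution
xfer_ms     = 0.1              # ~time for 4KB to pass under the head
random_ms   = avg_seek_ms + half_rev_ms + xfer_ms            # ~12.4ms
print(f"random 4KB read: {4 / random_ms * 1000:.0f} KB/s")   # ~322 KB/s

# Whole-track read: single-track seek + one full revolution.
track_kb   = 512               # ~1000 sectors/track * 512B
batched_ms = 0.8 + 8.3
print(f"track read: {track_kb / batched_ms * 1000 / 1024:.0f} MB/s")  # ~55 MB/s
```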
Lesson: Avoid random access. Try to do long sequential reads.
But how?
If your system reads/writes entire big files, lay them out contiguously on disk. Hard to achieve in practice!
If your system reads lots of small pieces of data, group them.
Caching
Already saw this with DNS. A common performance enhancement in systems.
How do we measure how well it works?
Average access time = hit_time * hit_rate + miss_time * miss_rate.
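A worked example, assuming a ~0.1ms cache hit and a ~12ms miss that goes to disk (numbers picked to match the disk above):

```python
def avg_access_ms(hit_ms, miss_ms, hit_rate):
    # Average access time = hit_time * hit_rate + miss_time * miss_rate.
    return hit_ms * hit_rate + miss_ms * (1 - hit_rate)

for rate in (0.5, 0.9, 0.99):
    print(f"hit rate {rate:.0%}: {avg_access_ms(0.1, 12.0, rate):.2f}ms")
# 50% -> 6.05ms, 90% -> 1.29ms, 99% -> 0.22ms:
# misses dominate until the hit rate is very high.
```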
Want high hit rate. How do we know what to put in the cache?
Can't keep everything.
So really: How do we know what to *evict* from the cache?
Popular eviction policy: Least Recently Used (LRU).
Evict the data that was used least recently.
Works well for popular data.
Bad for sequential access (think: Sequentially accessing a dataset that is larger than the cache).
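A minimal LRU sketch using Python's OrderedDict (the capacity and API here are illustrative, not a real system's):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()          # least recently used first

    def get(self, key):
        if key not in self.entries:
            return None                       # miss: caller fetches from disk
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=3)
for k in ("a", "b", "c", "a", "d"):
    cache.put(k, k.upper())
print(cache.get("b"))   # None: "b" was least recently used when "d" arrived
```

Note the sequential-access pathology: cycling through 4 distinct keys with capacity 3 evicts each entry right before it would be reused.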
Caching is good when:
All data fits in the cache.
There is locality, temporal or spatial.
Caching is bad for:
Writes (a write has to reach the disk, not just the cache: the disk is the non-volatile copy, so the cache must be kept consistent with it).
Moral: To build a good cache, need to understand the access patterns.
Like disk performance: to relax the disk as a bottleneck, we needed to understand the details of how it works.
Concurrency/Scheduling
Suppose the server alternates between CPU and disk (each request needs both; the CPU idles while the disk works, and vice versa):
CPU:  --A--       --B--       --C--
Disk:        --A--       --B--       --C--
Apply concurrency (start B's CPU work while A's disk read is in flight), and we can get:
CPU:  --A----B----C-- ...
Disk:       --A----B-- ...
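A sketch of that overlap with threads, simulating the disk read with a sleep (all times made up):

```python
import threading, time

def handle(name):
    time.sleep(0.1)             # simulated disk read: CPU is free meanwhile
    total = sum(range(10**6))   # simulated CPU work on the data
    print(name, "done")

start = time.time()
threads = [threading.Thread(target=handle, args=(r,)) for r in "ABC"]
for t in threads:
    t.start()
for t in threads:
    t.join()
# The three disk waits overlap: elapsed ~= 0.1s + CPU time,
# instead of 3 * (0.1s + CPU time) when requests run one at a time.
print(f"elapsed: {time.time() - start:.2f}s")
```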
This is a scheduling problem: Different orders of execution can lead to different performance.
Example:
5 concurrent threads issue concurrent reads to sectors 71, 10, 92, 45, and 29.
Naive algorithm: Seek to each sector in turn.
Better algorithm: Sort by track and perform the reads in order (see the sketch below). Gets even higher throughput as load increases (more pending requests means less arm movement per request).
Drawback: It's unfair.
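A sketch comparing total arm movement for the two orders, treating each request's sector number as an arm position with the head starting at 0:

```python
def total_movement(start, order):
    # Distance the arm travels serving the requests in the given order.
    moved, pos = 0, start
    for sector in order:
        moved += abs(sector - pos)
        pos = sector
    return moved

pending = [71, 10, 92, 45, 29]
print("naive (arrival order):", total_movement(0, pending))          # 277
print("sorted (elevator-ish):", total_movement(0, sorted(pending)))  # 92
```

Less movement per request means more requests served per second; but a request at a far-off track can starve while nearby requests keep arriving, which is the unfairness above.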
No one right answer to scheduling. Tradeoff between performance and fairness.
Parallelism
Goal: Have multiple disks, want to access them in parallel.
Problem: How do we divide data across the disks?
Depends on bottleneck:
Case 1: Many requests for many small files. Limited by disk seeks. Put each file on a single disk, and allow multiple disks to seek multiple records in parallel.
Case 2: Few large reads. Limited by sequential throughput. Stripe files across disks.
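A sketch of the striping arithmetic for case 2 (the 4-disk count is arbitrary): consecutive blocks of a file land on consecutive disks, so a large sequential read keeps every disk transferring at once.

```python
N_DISKS = 4

def locate(block_number):
    disk   = block_number % N_DISKS    # which disk holds this block
    offset = block_number // N_DISKS   # block index within that disk
    return disk, offset

for b in range(8):
    disk, offset = locate(b)
    print(f"file block {b} -> disk {disk}, offset {offset}")
```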
Another case: Parallelism across many computers.
Problem: How do we deal with machine failures?
(One) Solution: Go to recitation tomorrow!
Summary
We can't magically apply any of the previous techniques; we have to understand what goes on underneath.
Batching: How disk access works.
Caching: What is the access pattern?
Scheduling/concurrency: How disk access works, how system is being used (the workload).
Parallelism: What is the workload?
Techniques apply to multiple types of hardware.
E.g., caching is useful regardless of whether you have HDD or SSD.