1 File Read/Write
Ruby provides multiple approaches for file I/O. The most common are File.read/File.write for simple operations and File.open with a block for more control. The block form automatically closes the file handle.
Reading Files
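# Read the whole file into one string
content = File.read('config.txt')
puts content
# Read with an explicit encoding
content = File.read('data.txt', encoding: 'UTF-8')
# Read all lines into an array (each line keeps its trailing newline)
lines = File.readlines('log.txt')
lines.each_with_index do |line, i|
  puts "#{i + 1}: #{line.chomp}"
end
# Strip the newlines while reading
lines = File.readlines('log.txt', chomp: true)
# Stream line by line; suitable for large files
File.foreach('large_file.log') do |line|
  puts line if line.include?('ERROR')
end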
Writing Files
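# Overwrite (or create) a file in one call
File.write('output.txt', "Hello, Ruby!\nLine 2\n")
# Append instead of overwriting
File.write('log.txt', "#{Time.now} - New entry\n", mode: 'a')
# Block form for several writes to the same handle
File.open('report.txt', 'w') do |f|
  f.puts 'Report Title'
  f.puts '=' * 40
  10.times { |i| f.puts "Item #{i + 1}: #{rand(100)}" }
end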
File Open Modes
| Mode | Description |
|---|---|
| 'r' | Read only (default). File must exist. |
| 'w' | Write only. Creates file or truncates existing. |
| 'a' | Append only. Creates file if not exists. |
| 'r+' | Read and write. File must exist. |
| 'w+' | Read and write. Truncates or creates. |
| 'a+' | Read and append. Creates if not exists. |
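The difference between 'w' and 'a' in the table above can be checked against a throwaway file. This is a minimal sketch: the helper name mode_demo and the scratch file name are illustrative and not part of the original text.

```ruby
require 'tmpdir'

# Hypothetical helper: writes to a scratch file using different modes and
# returns the file content after the append step and after the truncate step.
def mode_demo
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'modes.txt')
    File.open(path, 'w') { |f| f.puts 'first' }   # 'w' creates the file
    File.open(path, 'a') { |f| f.puts 'second' }  # 'a' adds to the end
    after_append = File.read(path)
    File.open(path, 'w') { |f| f.puts 'third' }   # 'w' truncates before writing
    after_truncate = File.read(path)
    [after_append, after_truncate]
  end
end

after_append, after_truncate = mode_demo
puts after_append    # first, then second
puts after_truncate  # third only
```

Reopening with 'w' discards the earlier lines, which is why log-style files are usually opened with 'a'.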
File.open with Block (Idiomatic Ruby)
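# Binary mode: read a 4-byte header, then inspect size and position
File.open('data.bin', 'rb') do |f|
  header = f.read(4)
  puts "File size: #{f.size} bytes"
  puts "Current position: #{f.pos}"
  f.seek(0, IO::SEEK_SET) # rewind to the start
end
# The block's value becomes the return value of File.open
result = File.open('numbers.txt', 'r') do |f|
  f.readlines(chomp: true).map(&:to_i).sum
end
puts "Sum: #{result}"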
Always Use Block Form
File.open with a block ensures the file is closed automatically when the block exits, even if an exception occurs. This is equivalent to Python's with open() or Go's defer f.Close().
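The cleanup guarantee can be observed directly. The sketch below raises inside the block and then checks the handle; closed_after_error? is a hypothetical helper name, and the temp file exists only for the demonstration.

```ruby
require 'tempfile'

# Hypothetical helper: opens a file in block form, raises inside the block,
# and reports whether the handle was closed afterwards.
def closed_after_error?(path)
  handle = nil
  begin
    File.open(path, 'r') do |f|
      handle = f
      raise 'simulated failure'
    end
  rescue RuntimeError
    # The exception still propagates out of File.open; only the close is guaranteed.
  end
  handle.closed?
end

Tempfile.create('demo') do |tmp|
  puts closed_after_error?(tmp.path) # true
end
```

Note that the block form does not swallow the exception; the caller still has to rescue it if the failure should be handled.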
2 Directory Operations
Ruby provides Dir for directory listing and creation, FileUtils for recursive operations, and Pathname for object-oriented path manipulation.
Creating & Listing Directories
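# Create a single directory if it is missing
Dir.mkdir('output') unless Dir.exist?('output')
# Create nested directories in one call (like mkdir -p)
require 'fileutils'
FileUtils.mkdir_p('path/to/nested/dir')
# List entries, skipping hidden ones
entries = Dir.entries('.')
puts entries.reject { |e| e.start_with?('.') }
# Iterate without building an array first
Dir.foreach('.') do |entry|
  next if entry.start_with?('.')
  type = File.directory?(entry) ? 'DIR' : 'FILE'
  puts "#{type}: #{entry}"
end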
Dir.glob (Pattern Matching)
# Recursive search for Ruby files
ruby_files = Dir.glob('**/*.rb')
puts "Found #{ruby_files.size} Ruby files"
# Brace expansion matches several extensions
Dir.glob('app/**/*.{rb,erb}').each do |path|
  puts path
end
# Log files, newest first
Dir.glob('log/*.log').sort_by { |f| File.mtime(f) }.reverse.each do |f|
  puts "#{f} | #{File.size(f)} bytes | #{File.mtime(f)}"
end
# Dir[] is shorthand for Dir.glob
configs = Dir['config/**/*.yml']
FileUtils
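require 'fileutils'
FileUtils.cp('source.txt', 'backup.txt') # copy a file
FileUtils.cp_r('src_dir', 'dest_dir') # copy a directory recursively
FileUtils.mv('old_name.txt', 'new_name.txt') # move or rename
FileUtils.rm('temp.txt') # delete; raises if the file is missing
FileUtils.rm_rf('build/') # delete recursively, ignoring errors (use with care)
FileUtils.rm_f('maybe_exists.txt') # delete; no error if the file is missing
FileUtils.chmod(0o755, 'script.sh') # set permissions
FileUtils.touch('marker.txt') # create the file or update its mtime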
Pathname (Object-Oriented Paths)
require 'pathname'
path = Pathname.new('/home/user/projects/app/config.yml')
puts path.basename # config.yml
puts path.extname # .yml
puts path.dirname # /home/user/projects/app
puts path.parent # /home/user/projects/app
puts path.basename('.*') # config (without extension)
new_path = path.parent / 'database.yml'
puts new_path # /home/user/projects/app/database.yml
puts path.exist?
puts path.file?
puts path.directory?
puts path.absolute?
Pathname.new('.').children.each do |child|
  puts "#{child} (#{child.file? ? 'file' : 'dir'})"
end
Cross-Language Comparison
- Ruby Pathname ≈ Python pathlib.Path: both provide OO path manipulation with the / operator.
- Node.js path: functional API (path.join(), path.resolve()).
- Go filepath: similar functional approach with filepath.Join() and filepath.Walk().
3 CSV Processing
Ruby ships with the csv library for reading and writing CSV files (since Ruby 3.4 it is a bundled gem rather than a default gem, so Bundler-managed projects need it in the Gemfile). It supports headers, type conversion, and streaming for large files.
Reading CSV
require 'csv'
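# Load every row as an array of strings
data = CSV.read('users.csv')
data.each { |row| puts row.inspect }
# Treat the first row as headers and access fields by name
users = CSV.read('users.csv', headers: true)
users.each do |row|
  puts "#{row['name']} - #{row['email']} (age: #{row['age']})"
end
# Stream row by row; process stands in for your own method
CSV.foreach('large_data.csv', headers: true) do |row|
  process(row) if row['status'] == 'active'
end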
Writing CSV
# Write rows to a file
CSV.open('output.csv', 'w') do |csv|
  csv << ['name', 'email', 'age']
  csv << ['Alice', 'alice@example.com', 28]
  csv << ['Bob', 'bob@example.com', 32]
  csv << ['Carol', 'carol@example.com', 25]
end
# Build a CSV string in memory
csv_string = CSV.generate do |csv|
  csv << ['id', 'product', 'price']
  csv << [1, 'Ruby Book', 29.99]
  csv << [2, 'Keyboard', 89.99]
end
puts csv_string
CSV with Converters
# :numeric turns number-like fields into Integer or Float
CSV.foreach('data.csv', headers: true, converters: :numeric) do |row|
  puts row['price'].class # Float (Integer for whole numbers) instead of String
end
# Custom converter: map the strings 'true' and 'false' to booleans
custom_converter = ->(value) { value == 'true' ? true : value == 'false' ? false : value }
CSV.foreach('flags.csv', headers: true, converters: [custom_converter]) do |row|
  puts row['active'].class # TrueClass or FalseClass
end
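To see the :numeric converter at work without a file on disk, the same options can be passed to CSV.parse with an in-memory string. This is a small sketch; the sample data and the helper name parse_products are made up for illustration.

```ruby
require 'csv'

# Made-up sample data for illustration.
SAMPLE_CSV = <<~DATA
  product,price,qty
  Ruby Book,29.99,3
  Keyboard,89.99,1
DATA

# Hypothetical helper: parses CSV text with headers and numeric conversion,
# returning one plain Hash per row.
def parse_products(text)
  CSV.parse(text, headers: true, converters: :numeric).map(&:to_h)
end

parse_products(SAMPLE_CSV).each do |row|
  puts "#{row['product']}: price is #{row['price'].class}, qty is #{row['qty'].class}"
end
```

Fields that do not look like numbers, such as the product name, are left as strings.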
CSV Transformation
input = CSV.read('input.csv', headers: true)
# Keep adults only, sorted by name
output = input.select { |row| row['age'].to_i >= 18 }
              .sort_by { |row| row['name'] }
CSV.open('filtered.csv', 'w') do |csv|
  csv << input.headers
  output.each { |row| csv << row }
end
puts "Filtered #{input.size} -> #{output.size} rows"
4 JSON Processing
Beyond parsing API responses (covered in Chapter 5), JSON is commonly used for configuration files and data interchange on disk.
Read & Write JSON Files
require 'json'
# Load a JSON file; symbolize_names: true yields symbol keys
config = JSON.parse(File.read('config.json'), symbolize_names: true)
puts config[:database][:host]
data = {
  app_name: 'MyApp',
  version: '2.1.0',
  database: {
    host: 'localhost',
    port: 5432,
    name: 'myapp_production'
  },
  features: %w[auth logging cache]
}
# Pretty-printed output for humans, compact output for machines
File.write('config.json', JSON.pretty_generate(data))
File.write('compact.json', data.to_json)
Streaming Large JSON
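# Read JSONL one record at a time; process_record stands in for your own method
File.open('records.jsonl', 'r') do |f|
  f.each_line do |line|
    record = JSON.parse(line, symbolize_names: true)
    process_record(record)
  end
end
# Write JSONL; records is assumed to be an enumerable of hashes
File.open('output.jsonl', 'w') do |f|
  records.each do |record|
    f.puts record.to_json
  end
end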
JSON Lines (JSONL) Format
For large datasets, JSONL (one JSON object per line) is more memory-efficient than a single large JSON array. Each line can be parsed independently, making it suitable for streaming and log processing.
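A self-contained round trip makes the line-by-line property concrete. The sketch below is one possible arrangement, not a prescribed API: write_jsonl and each_jsonl are hypothetical helper names, and the records are invented sample data.

```ruby
require 'json'
require 'tempfile'

# Hypothetical helper: writes each record as one JSON document per line.
def write_jsonl(path, records)
  File.open(path, 'w') do |f|
    records.each { |record| f.puts JSON.generate(record) }
  end
end

# Hypothetical helper: yields one parsed record at a time, so only the
# current line is held in memory. Returns an Enumerator without a block.
def each_jsonl(path)
  return enum_for(:each_jsonl, path) unless block_given?

  File.foreach(path) do |line|
    next if line.strip.empty? # tolerate blank lines
    yield JSON.parse(line, symbolize_names: true)
  end
end

Tempfile.create(['records', '.jsonl']) do |tmp|
  write_jsonl(tmp.path, [{ id: 1, status: 'active' }, { id: 2, status: 'inactive' }])
  active_ids = each_jsonl(tmp.path).select { |r| r[:status] == 'active' }.map { |r| r[:id] }
  puts active_ids.inspect # [1]
end
```

Because each line is a complete JSON document, a corrupted line can be skipped or reported without losing the rest of the file.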
5 YAML Processing
YAML is the de facto configuration format in the Ruby ecosystem, used extensively in Rails (database.yml, locale files) and many Ruby tools. The yaml library is part of the standard library.
Read & Write YAML
require 'yaml'
# Load a YAML file, allowing Symbol values
config = YAML.load_file('config.yml', permitted_classes: [Symbol])
puts config['database']['host']
puts config['database']['port']
# Build a nested hash and write it back out as YAML
config = {
  'app' => {
    'name' => 'MyApp',
    'version' => '2.1.0',
    'debug' => false
  },
  'database' => {
    'adapter' => 'postgresql',
    'host' => 'localhost',
    'port' => 5432,
    'database' => 'myapp_prod',
    'pool' => 10
  },
  'redis' => {
    'url' => 'redis://localhost:6379/0'
  }
}
File.write('config.yml', YAML.dump(config))
Configuration Pattern with Environments
# config/database.yml
default: &default
  adapter: postgresql
  encoding: unicode
  pool: 5
development:
  <<: *default
  database: myapp_dev
  host: localhost
production:
  <<: *default
  database: myapp_prod
  host: db.example.com
  pool: 25
# aliases: true is required on Ruby 3.1+ (Psych 4) because the file uses &default / *default
all_config = YAML.load_file('config/database.yml', aliases: true)
env = ENV.fetch('RACK_ENV', 'development')
db_config = all_config[env]
puts "Connecting to #{db_config['database']} on #{db_config['host']}"
puts "Pool size: #{db_config['pool']}"
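How the anchor and merge keys resolve can be checked without touching the filesystem. The sketch below inlines the database.yml content as a string; db_config_for is a hypothetical helper name, and raising on an unknown environment is a design choice of this sketch rather than something the original prescribes.

```ruby
require 'yaml'

# Inline copy of the database.yml content, so the sketch needs no file.
DATABASE_YML = <<~CONFIG
  default: &default
    adapter: postgresql
    encoding: unicode
    pool: 5
  development:
    <<: *default
    database: myapp_dev
    host: localhost
  production:
    <<: *default
    database: myapp_prod
    host: db.example.com
    pool: 25
CONFIG

# Hypothetical helper: returns the merged settings for one environment.
# aliases: true lets the parser resolve &default / *default.
def db_config_for(env, yaml_text = DATABASE_YML)
  all = YAML.safe_load(yaml_text, aliases: true)
  all.fetch(env) { raise KeyError, "no database config for #{env.inspect}" }
end

puts db_config_for('development')['pool'] # 5, inherited from default
puts db_config_for('production')['pool']  # 25, overrides default
```

Keys listed after the merge key override the inherited values, which is why production ends up with its own pool size while keeping the shared adapter and encoding.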
ERB + YAML (Dynamic Config)
require 'yaml'
require 'erb'
template = File.read('config.yml.erb')
rendered = ERB.new(template).result
config = YAML.safe_load(rendered)
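The rendering step can be tried with an inline template instead of config.yml.erb. In this sketch the template content, the environment variable names DB_HOST and DB_POOL, and the helper render_config are all assumptions made for illustration.

```ruby
require 'yaml'
require 'erb'

# Illustrative template; DB_HOST and DB_POOL are assumed variable names.
# The quoted heredoc keeps the ERB tags literal until ERB processes them.
CONFIG_TEMPLATE = <<~'TPL'
  database:
    host: <%= ENV.fetch('DB_HOST', 'localhost') %>
    pool: <%= ENV.fetch('DB_POOL', '5') %>
TPL

# Hypothetical helper: evaluates the ERB tags, then parses the result as YAML.
def render_config(template)
  YAML.safe_load(ERB.new(template).result)
end

config = render_config(CONFIG_TEMPLATE)
puts config['database']['host']
puts config['database']['pool'].class # Integer, because YAML parses the rendered text
```

ERB evaluates arbitrary Ruby, so this pattern is only appropriate for templates you control; safe_load protects the YAML step, not the ERB step.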
YAML Security
- Use YAML.safe_load instead of YAML.load when loading untrusted input; it restricts deserialization to basic types.
- In Ruby 3.1+ (Psych 4), YAML.load behaves like safe_load by default and requires permitted_classes for non-basic types; YAML.unsafe_load restores the old behavior.
- Never load YAML from user input without safe_load; arbitrary object instantiation is a serious vulnerability.
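The restriction to basic types is easy to observe with a Symbol, which safe_load rejects unless it is explicitly permitted. A minimal sketch follows; try_safe_load is a hypothetical helper that turns the exception into a string so both outcomes can be printed.

```ruby
require 'yaml'

# Hypothetical helper: returns the parsed value, or a short message when
# safe_load refuses to instantiate a class that was not permitted.
def try_safe_load(text, **opts)
  YAML.safe_load(text, **opts)
rescue Psych::DisallowedClass => e
  "rejected: #{e.message}"
end

puts try_safe_load('--- :admin')                                      # rejected: ...
puts try_safe_load('--- :admin', permitted_classes: [Symbol]).inspect # :admin
puts try_safe_load('port: 5432')['port']                              # 5432
```

Allow-listing only the classes a config file actually needs keeps the attack surface small even when the input source is trusted today.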
6 File Metadata
Ruby provides rich methods for inspecting file attributes: existence, type, size, timestamps, and permissions.
Existence & Type Checks
puts File.exist?('config.yml') # true/false
puts File.file?('config.yml') # true if regular file
puts File.directory?('lib') # true if directory
puts File.symlink?('link.txt') # true if symbolic link
puts File.readable?('secrets.yml') # true if readable
puts File.writable?('output.txt') # true if writable
puts File.executable?('script.sh') # true if executable
puts File.zero?('empty.txt') # true if zero-length
Size & Timestamps
puts File.size('data.csv') # bytes
puts File.mtime('app.rb') # last modified time
puts File.atime('app.rb') # last access time
puts File.ctime('app.rb') # last status change time
puts File.birthtime('app.rb') # creation time (macOS/Windows)
stat = File.stat('app.rb')
puts "Size: #{stat.size} bytes"
puts "Mode: #{stat.mode.to_s(8)}"
puts "Owner UID: #{stat.uid}"
Practical: Directory Size Calculator
# Total size in bytes of all regular files under path (the glob skips hidden files)
def dir_size(path)
  Dir.glob(File.join(path, '**', '*'))
     .select { |f| File.file?(f) }
     .sum { |f| File.size(f) }
end
# Human-readable size, e.g. 1536 -> "1.5 KB"
def format_size(bytes)
  units = %w[B KB MB GB TB]
  return '0 B' if bytes.zero?
  exp = (Math.log(bytes) / Math.log(1024)).to_i
  exp = units.size - 1 if exp >= units.size
  format('%.1f %s', bytes.to_f / (1024**exp), units[exp])
end
path = ARGV[0] || '.'
total = dir_size(path)
puts "Total size of #{path}: #{format_size(total)}"
# Ten largest entries; directories are measured recursively before sorting
sizes = Dir.glob(File.join(path, '*')).map { |f| [f, File.file?(f) ? File.size(f) : dir_size(f)] }
sizes.sort_by { |_, size| -size }.first(10).each do |f, size|
  type = File.directory?(f) ? 'DIR ' : 'FILE'
  puts " #{type} #{format_size(size).rjust(10)} #{File.basename(f)}"
end
Cross-Language Comparison
- Ruby File.exist? ≈ Python os.path.exists() ≈ Node.js fs.existsSync() ≈ PHP file_exists()
- Ruby Dir.glob ≈ Python glob.glob() ≈ Node.js glob ≈ PHP glob()
- Ruby FileUtils ≈ Python shutil ≈ Node.js fs-extra
7 Chapter Summary
File Read/Write
File.read/File.write for simple operations, File.open with block for guaranteed cleanup.
Directory Operations
Dir.glob for pattern matching, FileUtils for copy/move/delete, Pathname for OO paths.
CSV Processing
CSV.read with headers, CSV.foreach for streaming, converters for automatic type casting.
JSON Files
JSON.parse(File.read(...)) to load, JSON.pretty_generate for formatted output, JSONL for streaming.
YAML Config
YAML.load_file for reading, YAML.dump for writing. Use safe_load for untrusted input.
File Metadata
File.exist?, File.size, File.mtime, File.stat for inspecting file attributes and permissions.
Next Chapter Preview: Chapter 7 covers Ruby's ecosystem tools: a Bundler deep dive, RSpec testing, Rake build automation, popular gems, and code quality tools like RuboCop.