What is Code Surgery?
Real surgery has three elements:
- Incision/Excision - Cut into tissue, remove the lesion
- Hemostasis/Suture - Stop bleeding, stitch the wound closed
- Reconstruction - Rebuild what was lost
Code Surgery follows the same pattern:
| Surgery | Code Surgery |
|---|---|
| Incision | Grammar defines what to cut |
| Excision | Transformer removes/replaces the target |
| Suture | Gaps preserved - unmatched code stays intact |
| Reconstruction | Generate new code from schemas and specs |
Unlike regex or shell scripts that make rough cuts, Code Surgery understands structure. It's the difference between a scalpel and a machete.
Quick Example
Rename print() to console_log():
| grammar transformer source result |
grammar := Grammar from: '
call: @func FUNC "(" @args ARGS ")"
FUNC: /[a-zA-Z_][a-zA-Z0-9_]*/
ARGS: /[^)]*/
%ignore /\s+/
'.
transformer := Transformer new grammar: grammar.
transformer rule: 'call' do: [:m |
| func args |
func := m at: 'func'.
args := m at: 'args'.
(func = 'print')
ifTrue: [ 'console_log(' , args , ')' ]
ifFalse: [ func , '(' , args , ')' ]
].
source := 'x = print(hello); y = other(world); z = print(foo)'.
result := transformer transform: source.
"Result: x = console_log(hello); y = other(world); z = console_log(foo)"
Comments, strings, and unmatched code remain untouched.
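The match-and-preserve idea can be approximated outside Lambda Smalltalk. The Python sketch below is an illustration only, not the library's implementation; `rename_calls` and `CALL` are hypothetical names. It mirrors the grammar above (a name, an opening parenthesis, anything up to the closing parenthesis) and copies every unmatched character through unchanged.

```python
import re

# Mirrors the grammar above: FUNC "(" ARGS ")" where ARGS is anything but ")".
CALL = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\(([^)]*)\)")

def rename_calls(source: str, old: str, new: str) -> str:
    """Rename calls to `old`; text between matches is left as it was."""
    def rewrite(m: re.Match) -> str:
        func, args = m.group(1), m.group(2)
        return f"{new if func == old else func}({args})"
    return CALL.sub(rewrite, source)

print(rename_calls("x = print(hello); y = other(world); z = print(foo)",
                   "print", "console_log"))
# prints: x = console_log(hello); y = other(world); z = console_log(foo)
```

Like the ARGS pattern it copies, this sketch stops at the first closing parenthesis, so nested calls would need a richer rule.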
The Problem with Regex
// Regex: rename "user" to "account"
s/user/account/g
// Result: BROKEN
"username" → "accountname" // wrong!
"userAgent" → "accountAgent" // wrong!
// comment about user → ... // wrong!
The Solution
// Code Surgery: rename variable "user" to "account"
"username" → "username" // unchanged
"userAgent" → "userAgent" // unchanged
// comment → // comment // unchanged
user.login() → account.login() // correct!
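One way to get this behaviour is to tokenize before replacing: strings and comments are consumed as whole tokens, and only complete identifiers are compared against the target name. The Python sketch below assumes JavaScript-like syntax and is not how Lambda Smalltalk works internally; `TOKEN` and `rename_identifier` are hypothetical names, and template literals and regex literals are deliberately ignored.

```python
import re

# Alternatives are tried left to right at each position, so a string or
# comment is swallowed whole before any identifier inside it can match.
TOKEN = re.compile(r"""
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<ident>[A-Za-z_$][A-Za-z0-9_$]*)
""", re.VERBOSE | re.DOTALL)

def rename_identifier(source: str, old: str, new: str) -> str:
    """Rename whole identifiers equal to `old`, skipping strings and comments."""
    def rewrite(m: re.Match) -> str:
        if m.group("ident") == old:
            return new
        return m.group()  # string, comment, or a different identifier
    return TOKEN.sub(rewrite, source)

src = 'var username = userAgent; // comment about user\nuser.login("user");'
print(rename_identifier(src, "user", "account"))
```

Only the `user` in `user.login(...)` changes; `username`, `userAgent`, the comment, and the string literal pass through untouched.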
Real-World Scenario: Callback to Async/Await
Your company has 3000 files with callback-style async code:
function loadUser(id, callback) {
db.query("SELECT * FROM users WHERE id = ?", [id], function(err, rows) {
if (err) {
callback(err, null);
return;
}
callback(null, rows[0]);
});
}
You need to migrate to async/await:
async function loadUser(id) {
const rows = await db.query("SELECT * FROM users WHERE id = ?", [id]);
return rows[0];
}
This is not a simple text replacement. The entire structure changes:
- Callback parameter removed
- Function becomes async
- Nested callback becomes await
- Error handling changes to try/catch
- Return style completely different
The Surgery:
| grammar transformer |
grammar := Grammar from: '
callback_func: "function" @name NAME "(" @params PARAMS "," "callback" ")" @body BODY
NAME: /[a-zA-Z_][a-zA-Z0-9_]*/
PARAMS: /[^,)]*/
BODY: /{[^}]*}/
%ignore /\s+/
'.
transformer := Transformer new grammar: grammar.
transformer rule: 'callback_func' do: [:m |
| name params body |
name := m at: 'name'.
params := m at: 'params'.
body := m at: 'body'.
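"This rule rewrites only the signature. BODY stops at the first closing brace, so nested callbacks like the one in loadUser need further rules."
'async function ' , name , '(' , params , ') ' , body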
].
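The BODY pattern `/{[^}]*}/` ends at the first closing brace, which falls short for a function like `loadUser` whose body contains nested blocks. A common workaround is to count brace depth. The Python sketch below shows that technique under simplifying assumptions: `match_braces` is a hypothetical helper, and braces inside strings or comments are not treated specially.

```python
import re

def match_braces(source: str, start: int) -> int:
    """Return the index just past the `}` that closes the `{` at `start`."""
    if source[start] != "{":
        raise ValueError("start must point at an opening brace")
    depth = 0
    for i in range(start, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    raise ValueError("unbalanced braces")

src = "function f(id, callback) { if (x) { a(); } b(); }"
start = src.index("{")
print(src[start:match_braces(src, start)])          # full body, nesting included
print(re.search(r"\{[^}]*\}", src).group())         # cut off at the first "}"
```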
Without Code Surgery:
- Create Excel checklist (3000 rows)
- Assign 15 developers
- 2 weeks of manual rewrites
- Code review finds inconsistencies
- Bugs in production from missed edge cases
With Code Surgery:
- Write grammar + transformer (2 hours)
- Run script (30 seconds)
- Review diff, fix edge cases
- Done in a day
SQL Dialect Migration
Migrating from MySQL to PostgreSQL across 500 stored procedures:
Before (MySQL):
SELECT * FROM users LIMIT 10, 20;
IFNULL(name, 'Unknown')
DATE_FORMAT(created_at, '%Y-%m-%d')
After (PostgreSQL):
SELECT * FROM users LIMIT 20 OFFSET 10;
COALESCE(name, 'Unknown')
TO_CHAR(created_at, 'YYYY-MM-DD')
The Surgery:
| grammar source |
grammar := Grammar from: '
limit: "LIMIT" " " @offset NUM "," " " @count NUM
NUM: /[0-9]+/
'.
source := 'SELECT * FROM users LIMIT 10, 20;'.
Grammar replace: grammar in: source with: [:m |
| offset count |
offset := m at: 'offset'.
count := m at: 'count'.
'LIMIT ' , count , ' OFFSET ' , offset
].
"Result: SELECT * FROM users LIMIT 20 OFFSET 10;"
Regex cannot reliably handle:
- LIMIT offset, count → LIMIT count OFFSET offset (argument reordering)
- Function names inside string literals (must not change)
- Nested function calls
Code Surgery parses the SQL, understands the AST, and transforms correctly.
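To make the string-literal and reordering points concrete, here is a Python sketch of the same two rewrites (LIMIT and IFNULL). It is an approximation for illustration, not the Lambda Smalltalk implementation; `SQL_TOKEN` and `mysql_to_postgres` are hypothetical names, and DATE_FORMAT is left out because its format string needs its own translation table.

```python
import re

# String literals come first so that SQL keywords inside them never match.
SQL_TOKEN = re.compile(r"""
    (?P<string>'(?:''|[^'])*')
  | (?P<limit>\bLIMIT\s+(?P<offset>\d+)\s*,\s*(?P<count>\d+))
  | (?P<func>\bIFNULL\b(?=\s*\())
""", re.VERBOSE | re.IGNORECASE)

def mysql_to_postgres(sql: str) -> str:
    """Rewrite `LIMIT offset, count` and IFNULL(...) outside string literals."""
    def rewrite(m: re.Match) -> str:
        if m.group("limit"):
            return f"LIMIT {m.group('count')} OFFSET {m.group('offset')}"
        if m.group("func"):
            return "COALESCE"
        return m.group()  # string literal: copied through unchanged
    return SQL_TOKEN.sub(rewrite, sql)

print(mysql_to_postgres(
    "SELECT IFNULL(name, 'IFNULL(x) LIMIT 1, 2') FROM users LIMIT 10, 20;"))
```

The IFNULL and LIMIT text inside the quoted string survives, while the real function call and the real LIMIT clause are rewritten.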
Adding await to API Calls
Before:
/* comment */ obj.fetch() and client.send() /* end */
After:
/* comment */ await obj.fetch() and await client.send() /* end */
The Surgery:
| grammar source result |
grammar := Grammar from: '
dotcall: RECV "." METHOD "()"
RECV: /[a-z]+/
METHOD: /[a-z]+/
%ignore /./
'.
source := '/* comment */ obj.fetch() and client.send() /* end */'.
result := Grammar replace: grammar in: source with: [:m |
'await ' , (m at: 'text')
].
result printNl.
Comments are preserved. Only obj.fetch() and client.send() are wrapped.
Why Not Existing Tools?
| Tool | Limitation |
|---|---|
| sed/awk | Text-only, no syntax understanding |
| Regex | Breaks on edge cases, modifies strings/comments |
| Semgrep | Limited to supported languages |
| jscodeshift | JavaScript only |
| Refaster | Java only |
Lambda Smalltalk: Define your own grammar for any language or format.
Code Generation
Code Surgery isn't just about transformation. You can generate new code from schemas, specifications, or documents.
PostgreSQL Schema → Entity Classes
Connect directly to your database and generate type-safe entities:
Module import: 'CodeGen'. "For snakeToPascal and other utilities"
| conn schema |
"Connect to PostgreSQL and fetch schema"
conn := Postgres connect: 'host=localhost dbname=myapp'.
schema := conn query: '
SELECT table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_schema = ''public''
ORDER BY table_name, ordinal_position
'.
"Group by table and generate Rust structs"
(schema groupBy: [:row | row at: 'table_name'])
keysAndValuesDo: [:table :columns |
| code data fields |
"Build fields array"
fields := columns collect: [:c |
#{
'name' -> (c at: 'column_name').
'rust_type' -> (self pgToRust: (c at: 'data_type')
nullable: (c at: 'is_nullable') = 'YES')
}
].
"Build template data"
data := #{
'struct_name' -> table snakeToPascal.
'fields' -> fields
}.
code := Template render: '
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct {{struct_name}} {
{{#fields}}
pub {{name}}: {{rust_type}},
{{/fields}}
}
' with: data.
File write: ('src/entities/' , table , '.rs') content: code.
('Generated: ' , table , '.rs') printNl.
].
conn close.
Output (users.rs):
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Users {
pub id: i64,
pub name: String,
pub email: String,
pub created_at: Option<DateTime<Utc>>,
}
No ORM, no code generator tool, no configuration files. Just connect, query, generate.
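The script above calls `pgToRust:nullable:` and `snakeToPascal` without showing them. A Python sketch of what such helpers might do follows; the mapping table is an assumption chosen to be consistent with the `users.rs` output, not the CodeGen module's actual table. The keys are the type names PostgreSQL reports in `information_schema.columns.data_type`.

```python
# Hypothetical mapping; extend it for the types your schema actually uses.
PG_TO_RUST = {
    "smallint": "i16",
    "integer": "i32",
    "bigint": "i64",
    "boolean": "bool",
    "text": "String",
    "character varying": "String",
    "double precision": "f64",
    "timestamp with time zone": "DateTime<Utc>",
}

def pg_to_rust(data_type: str, nullable: bool) -> str:
    """Map a PostgreSQL type name to a Rust type, wrapping nullables in Option."""
    rust = PG_TO_RUST.get(data_type)
    if rust is None:
        raise KeyError(f"no Rust mapping for PostgreSQL type {data_type!r}")
    return f"Option<{rust}>" if nullable else rust

def snake_to_pascal(name: str) -> str:
    """order_items -> OrderItems"""
    return "".join(part.capitalize() for part in name.split("_") if part)

print(pg_to_rust("timestamp with time zone", nullable=True))
print(snake_to_pascal("order_items"))
```

Failing loudly on an unmapped type is a deliberate choice here: a silent fallback would produce structs that compile but misrepresent the schema. Note that `DateTime<Utc>` comes from the chrono crate, so a generated file that uses it also needs `use chrono::{DateTime, Utc};`.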
CSV Specification → TypeScript Interfaces
Many teams maintain data definitions in spreadsheets. Export to CSV and generate code:
Module import: 'CodeGen'.
| rows entities |
"Parse CSV and group by entity name"
rows := Csv parse: (File read: 'docs/entities.csv').
entities := rows groupBy: [:row | row at: 'entity'].
"Generate TypeScript interfaces"
entities keysAndValuesDo: [:name :fieldRows |
| code data fields |
fields := fieldRows collect: [:f |
| d |
d := Dict new.
d at: 'field' put: (f at: 'field').
d at: 'tsType' put: (self csvToTs: (f at: 'type')).
d at: 'required' put: ((f at: 'required') = 'Y').
d
].
data := Dict new.
data at: 'name' put: name.
data at: 'fields' put: fields.
code := Template render: '
export interface {{name}} {
{{#fields}}
{{field}}{{^required}}?{{/required}}: {{tsType}};
{{/fields}}
}
' with: data.
File write: ('src/types/' , name , '.ts') content: code.
].
CSV Input (entities.csv):
entity,field,type,required
User,id,integer,Y
User,name,string,Y
User,email,string,Y
User,phone,string,N
Output (User.ts):
export interface User {
id: number;
name: string;
email: string;
phone?: string;
}
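For comparison, the same CSV-to-interface flow can be sketched in Python with only the standard library. `CSV_TO_TS` stands in for the `csvToTs:` helper, whose real mapping is not shown on this page, and the two-space indentation of the output is likewise an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for csvToTs:.
CSV_TO_TS = {"integer": "number", "string": "string", "boolean": "boolean"}

def generate_interfaces(csv_text: str) -> dict:
    """Return {entity name: TypeScript interface source} for a spec CSV."""
    entities = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        entities[row["entity"]].append(row)

    out = {}
    for name, fields in entities.items():
        lines = [f"export interface {name} {{"]
        for f in fields:
            optional = "" if f["required"] == "Y" else "?"
            lines.append(f"  {f['field']}{optional}: {CSV_TO_TS[f['type']]};")
        lines.append("}")
        out[name] = "\n".join(lines) + "\n"
    return out

spec = """entity,field,type,required
User,id,integer,Y
User,phone,string,N
"""
print(generate_interfaces(spec)["User"])
```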
OpenAPI Spec → API Client
| spec client |
"Parse OpenAPI YAML"
spec := Yaml parse: (File read: 'api/openapi.yaml').
"Generate client methods for each endpoint"
client := Template render: '
export class ApiClient {
constructor(private baseUrl: string) {}
{{#endpoints}}
async {{method}}{{operationId}}({{params}}): Promise<{{responseType}}> {
const response = await fetch(
`${this.baseUrl}{{path}}`,
{ method: "{{httpMethod}}" }
);
return response.json();
}
{{/endpoints}}
}
' with: (self extractEndpoints: spec).
File write: 'src/api/client.ts' content: client.
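The template relies on `extractEndpoints:`, which is not defined on this page. The Python sketch below shows one plausible shape for it, working on an already-parsed spec. Everything here is an assumption: the function name, the output keys (only those that map directly onto the spec), and the requirement that every operation carries an `operationId`, which OpenAPI itself treats as optional.

```python
HTTP_METHODS = ("get", "put", "post", "delete", "patch")

def extract_endpoints(spec: dict) -> list:
    """Flatten `paths` into one dict per operation for the template."""
    endpoints = []
    for path, item in spec.get("paths", {}).items():
        for method in HTTP_METHODS:
            op = item.get(method)
            if op is None:
                continue
            endpoints.append({
                "operationId": op["operationId"],
                "httpMethod": method.upper(),
                # OpenAPI writes {id}; inside a JS template literal it must be ${id}.
                "path": path.replace("{", "${"),
            })
    return endpoints

spec = {"paths": {"/users/{id}": {"get": {"operationId": "getUser"}}}}
print(extract_endpoints(spec))
```

Parameters and response types are omitted because deriving them means walking `parameters`, `requestBody`, and `responses`, which goes beyond a short sketch.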
Why Generate Code?
| Manual Approach | Code Generation |
|---|---|
| Copy-paste from spec | Single source of truth |
| Typos and inconsistencies | Output matches the spec by construction |
| Spec changes = manual updates | Re-run script |
| Hours of tedious work | Seconds |
The master surgeon doesn't just transform code—they generate it from the source of truth.
Batch Processing
Process all files in a directory:
| files grammar |
grammar := Grammar from: '...'.
files := File glob: 'src/**/*.js'.
files do: [:path |
| content result |
content := File read: path.
result := Grammar replace: grammar in: content with: [:m | ... ].
File write: path content: result.
('Processed: ' , path) printNl.
].
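Because the loop above overwrites files in place, it can help to see which files would change before writing anything. The Python sketch below adds a dry-run switch and skips files the transformation leaves untouched; `transform_tree` is a hypothetical name and `transform` is any function from source text to source text.

```python
from pathlib import Path

def transform_tree(root: str, pattern: str, transform, dry_run: bool = True) -> list:
    """Apply `transform` to every file matching `pattern` under `root`.

    Returns the paths whose content changes; writes only when dry_run is False.
    """
    changed = []
    for path in sorted(Path(root).glob(pattern)):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        result = transform(original)
        if result == original:
            continue  # nothing matched; leave the file and its mtime alone
        changed.append(str(path))
        if not dry_run:
            path.write_text(result, encoding="utf-8")
    return changed

if __name__ == "__main__":
    for p in transform_tree(".", "src/**/*.js", lambda s: s.replace("print(", "console_log(")):
        print("Would change:", p)
```

Running once with the default `dry_run=True` gives the list to review; a second run with `dry_run=False` applies it.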
Beyond Code: Data and Network Surgery
The master surgeon operates on more than just code. Data streams, cloud resources, and network protocols are all surgical targets.
AWS S3 Without CLI
Access S3 directly using HTTP and AWS Signature V4. No CLI installation, no SDK dependencies:
| accessKey secretKey region bucket |
accessKey := Env at: 'AWS_ACCESS_KEY_ID'.
secretKey := Env at: 'AWS_SECRET_ACCESS_KEY'.
region := 'us-east-1'.
bucket := 'my-bucket'.
"Build AWS Signature V4"
| date timestamp scope signingKey signature headers |
date := DateTime now format: '%Y%m%d'.
timestamp := DateTime now format: '%Y%m%dT%H%M%SZ'.
"Derive signing key"
signingKey := Hmac sha256: date key: ('AWS4' , secretKey).
signingKey := Hmac sha256: region key: signingKey.
signingKey := Hmac sha256: 's3' key: signingKey.
signingKey := Hmac sha256: 'aws4_request' key: signingKey.
"Create canonical request and sign"
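| host url payloadHash canonicalRequest stringToSign |
host := bucket , '.s3.' , region , '.amazonaws.com'.
url := 'https://' , host , '/data/export.csv'.
"S3 expects the payload hash as a signed header; a GET has an empty body."
"Assumes Sha2 sha256: returns a lowercase hex digest, as the string-to-sign step already requires."
payloadHash := Sha2 sha256: ''.
"Canonical request: method, path, query string, headers, signed header names, payload hash"
canonicalRequest := 'GET\n' , '/data/export.csv\n' , '\n' ,
'host:' , host , '\n' ,
'x-amz-content-sha256:' , payloadHash , '\n' ,
'x-amz-date:' , timestamp , '\n' ,
'\n' ,
'host;x-amz-content-sha256;x-amz-date\n' ,
payloadHash.
stringToSign := 'AWS4-HMAC-SHA256\n' , timestamp , '\n' ,
date , '/' , region , '/s3/aws4_request\n' ,
(Sha2 sha256: canonicalRequest).
signature := Hmac sha256: stringToSign key: signingKey.
"Make request with signed headers"
headers := #{
'Authorization' -> ('AWS4-HMAC-SHA256 Credential=' , accessKey , '/' , date ,
'/' , region , '/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=' , signature).
'x-amz-content-sha256' -> payloadHash.
'x-amz-date' -> timestamp.
'Host' -> host
}.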
| response |
response := Http get: url headers: headers.
response printNl.
Why this matters:
- Deploy to minimal environments (Alpine, scratch containers)
- No Python/Node runtime needed for AWS access
- Single binary handles everything
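For cross-checking the key derivation above, here is the same Signature V4 chain in Python using only `hmac` and `hashlib`. The function names are hypothetical. Two details are easy to miss: each intermediate key is raw bytes (not hex), and only the final signature is hex-encoded.

```python
import hashlib
import hmac

def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

def signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: date -> region -> service -> aws4_request."""
    k_date = _hmac(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")

def sign(secret_key: str, date: str, timestamp: str, region: str,
         service: str, canonical_request: str) -> str:
    """Return the hex signature for an already-built canonical request."""
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        timestamp,                                   # e.g. 20240101T000000Z (UTC)
        f"{date}/{region}/{service}/aws4_request",   # credential scope
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])
    key = signing_key(secret_key, date, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

The `date` and `timestamp` arguments should come from a single UTC clock reading so that the scope and the `x-amz-date` header agree.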
Data Pipeline: CSV → Transform → API
Process data files and push to external services:
| rows transformed |
"Read and parse CSV"
rows := Csv parse: (File read: 'data/users.csv').
"Transform: filter active users, normalize emails"
transformed := (rows select: [:row | (row at: 'status') = 'active'])
collect: [:row |
#{
'id' -> (row at: 'id').
'email' -> (row at: 'email') asLowercase.
'name' -> (row at: 'name').
'imported_at' -> (DateTime now format: '%Y-%m-%dT%H:%M:%SZ')
}
].
"Push to API in batches"
| headers |
headers := #{
'Content-Type' -> 'application/json'.
'Authorization' -> ('Bearer ' , (Env at: 'API_TOKEN'))
}.
(transformed chunks: 100) do: [:batch |
| payload response |
payload := Json generate: batch.
response := Http post: 'https://api.example.com/users/import'
body: payload
headers: headers.
('Imported batch: ' , response) printNl.
].
('Total processed: ' , transformed size asString) printNl.
The pipeline:
CSV File → Parse → Filter → Transform → JSON → HTTP POST → Done
No Pandas. No data engineering framework. Just precise cuts.
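The filter, normalize, and batch stages are small enough to sketch in plain Python as well. This is an illustration of the data flow only: `prepare` and `chunks` are hypothetical names, and the HTTP step is left out so the example runs without a network.

```python
def prepare(rows: list, imported_at: str) -> list:
    """Keep active users and normalize their emails to lowercase."""
    return [
        {
            "id": r["id"],
            "email": r["email"].lower(),
            "name": r["name"],
            "imported_at": imported_at,
        }
        for r in rows
        if r["status"] == "active"
    ]

def chunks(items: list, size: int):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

rows = [
    {"id": "1", "email": "Ann@Example.com", "name": "Ann", "status": "active"},
    {"id": "2", "email": "bob@example.com", "name": "Bob", "status": "inactive"},
]
for batch in chunks(prepare(rows, "2024-01-01T00:00:00Z"), 100):
    print(len(batch), batch[0]["email"])
```

Taking the timestamp as an argument, rather than reading the clock per row, gives every record in one run the same `imported_at` value.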
TCP: Raw HTTP Client
When you need to understand what's actually happening on the wire:
| sock request response |
"Connect to server"
sock := Tcp connect: 'httpbin.org' port: 80.
"Build raw HTTP request"
request := 'GET /json HTTP/1.1\r\n',
'Host: httpbin.org\r\n',
'User-Agent: Lambda-Smalltalk/1.0\r\n',
'Accept: application/json\r\n',
'Connection: close\r\n',
'\r\n'.
"Send and receive"
sock send: request.
response := sock recv: 4096.
sock close.
"Parse response"
| lines headers body |
lines := response asString lines.
headers := lines copyFrom: 1 to: (lines indexOf: '').
body := lines copyFrom: (lines indexOf: '') + 1 to: lines size.
('Status: ' , (lines at: 1)) printNl.
('Body: ' , (body join: '\n')) printNl.
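HTTP separates headers from body with an empty line, and each line ends in CRLF, so splitting on the first `\r\n\r\n` is a reliable way to divide the two. The Python sketch below parses an already-received response under that rule; `parse_http_response` is a hypothetical helper and does not handle chunked transfer encoding.

```python
def parse_http_response(raw: bytes):
    """Split a raw HTTP/1.x response into (status line, headers, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return status_line, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: application/json\r\n"
       b"Connection: close\r\n"
       b"\r\n"
       b'{"ok": true}\n')
status, headers, body = parse_http_response(raw)
print(status, headers["content-type"], body)
```

A single receive call may return only part of a response. With `Connection: close`, reading until the peer closes the socket is the simplest way to collect all of it before parsing.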
Or build a Redis client:
| redis |
redis := Tcp connect: 'localhost' port: 6379.
"PING"
redis send: '*1\r\n$4\r\nPING\r\n'.
(redis recvLine) printNl. "=> +PONG"
"SET key value"
redis send: '*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n'.
(redis recvLine) printNl. "=> +OK"
"GET key"
redis send: '*2\r\n$3\r\nGET\r\n$5\r\nmykey\r\n'.
(redis recvLine) printNl. "=> $7"
(redis recvLine) printNl. "=> myvalue"
redis close.
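The hand-written strings above follow Redis's RESP framing: an array header `*<count>`, then each argument as `$<byte length>` followed by the bytes, every part terminated by CRLF. A small encoder removes the need to count lengths by hand. This Python sketch uses a hypothetical function name, `encode_command`.

```python
def encode_command(*args: str) -> bytes:
    """Encode a Redis command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))  # length in bytes, not characters
        parts.append(data + b"\r\n")
    return b"".join(parts)

print(encode_command("PING"))
# prints: b'*1\r\n$4\r\nPING\r\n'
print(encode_command("SET", "mykey", "myvalue"))
# prints: b'*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$7\r\nmyvalue\r\n'
```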
Why raw TCP?
- Debug protocol issues
- Implement custom protocols
- No library dependencies
- Learn how things actually work
Summary
Tools used on this page:
| Class | Purpose |
|---|---|
| Grammar | Pattern definition, syntax-aware parsing |
| Transformer | Rule-based code transformation |
| Template | Code generation with Mustache templates |
| Json / Yaml / Csv | Data format reading and writing |
| Http / Tcp | Network communication |
| File | File I/O, glob search |