Rust gRPC authentication service + cross-platform client
How Modern Type Systems Enable Infrastructure You Can Trust
Authentication is the front door of your platform. Every request hits it, every incident starts or ends there, and every millisecond of latency multiplies across your entire product.
For years, backend engineers have faced painful trade-offs:
- Velocity vs. Stability — Move fast with dynamic languages like Node.js, or build for stability with Java/Go?
- Simplicity vs. Correctness — Write simple code that might fail at runtime, or complex code that is formally verified?
- Safety vs. Performance — Accept garbage collection pauses for memory safety, or risk manual memory management in C++?
Rust eliminates these compromises.
By combining Rust with Tonic (gRPC), SQLx (async SQL), and Tower (middleware), you get a stack where:
- Correctness is enforced by the compiler, not just unit tests
- Latency is predictable, with no garbage collector to cause spikes under load
- Security is type-driven, making entire classes of vulnerabilities unrepresentable
This article dissects a production-grade authentication service that demonstrates why infrastructure-critical services are increasingly being built in Rust. We explore architectural innovations that are only possible with Rust’s type system:
- Compile-time SQL verification catching schema mismatches before deployment
- Protocol multiplexing (gRPC + gRPC-Web + REST) behind one port and one middleware stack
- Atomic database operations preventing race conditions through PostgreSQL functions
- Zero-cost abstractions enabling tiny images (≈ 15 MB) with sub-10 ms P99 latencies
- Type-driven security making whole vulnerability classes harder (or impossible) to write
Note on numbers: The memory/latency/cost figures below are typical “order of magnitude” comparisons. Your results will depend on dependencies, runtime flags, database, traffic shape, and deployment environment.
What This System Delivers
- ✅ OWASP Authentication Cheat Sheet compliance
- ✅ Multi-device session tracking with geolocation
- ✅ Email/phone authentication with verification flows
- ✅ JWT tokens with automatic rotation and single-use refresh tokens
- ✅ Account lockout with configurable brute-force protection
- ✅ Compile-time guarantees for all database queries
- ✅ Full observability (OpenTelemetry, Prometheus, Sentry)
- ✅ Cloud-native deployment with multi-arch ARM64/AMD64 images
Live Demo: auth-demo.dmitrii.app
Source Code: Server (Rust) · Client (Flutter)
TL;DR (For Skimmers)
- Rust + Tonic + SQLx + Tower makes it practical to build an auth service that is fast, safe, and production-friendly
- One port serves gRPC, gRPC-Web, and a small REST surface through lightweight protocol detection at the edge
- Compile-time SQL validation catches schema drift early and supports an offline workflow for CI
- Atomic database functions in PostgreSQL keep security-critical flows race-free under concurrency
- Security controls align with OWASP guidance: generic errors, account lockout, strong password hashing, and token rotation
Who This Article Is For
- Backend engineers designing authentication or identity services (or migrating an existing one)
- Rust developers seeking a realistic reference architecture (not just a toy example)
- DevOps and SRE teams who value deployability: single-port services, small images, and predictable resource usage
- Security-minded teams looking for pragmatic OWASP-aligned defaults: password hashing, token rotation, account lockout, and generic errors
How to Read This (So You Don’t Get Lost)
Table of Contents
Part I: Why Rust Changes Everything
Part II: Architecture
Part III: Security & OWASP Compliance
Part IV: Authentication Flows
Part V: Production Operations
Part VI: Getting Started & Comparison
Wrap-Up
Part I: Why Rust Changes Everything
The Resource Efficiency Revolution
Authentication services are high-traffic gateways — every request touches them. They must be fast, secure, and economical to run at scale.
The Economics of Memory
| Runtime | Idle Memory | P99 Latency | Cost @ 100 Replicas |
| -------------------- | ----------- | ----------- | ------------------- |
| **Java/Spring Boot** | 250–500 MB | 50–200 ms | 25–50 GB requested |
| **Node.js/Express** | 70–150 MB | 20–80 ms | 7–15 GB requested |
| **Go/gRPC** | 20–40 MB | 5–20 ms | 2–4 GB requested |
| **Rust/Tonic** | **5–15 MB** | **2–10 ms** | **0.5–1.5 GB** |

These ranges are illustrative, not guarantees. Baseline memory/latency depends heavily on framework choices, TLS, logging/metrics, connection pools, and workload. Use this as a mental model, then benchmark your own service.
In Kubernetes — where you often pay for requested resources — a smaller and more predictable baseline makes it easier to right-size requests and scale more granularly.
The real breakthrough isn’t just efficiency — it’s predictability. Without a garbage collector, there are no GC pauses — removing one common source of tail-latency spikes. P99 latencies remain consistently low under load, which is critical when every millisecond of authentication latency cascades across your platform.
Compile-Time SQL Verification
Most database bugs surface in production. A renamed column, a missing WHERE clause, or a type mismatch — each produces a runtime error that could have been prevented.
SQLx eliminates this entire class of bugs through compile-time verification.
How It Works
let user = sqlx::query_as!(
User,
"SELECT id, email FROM auth.users WHERE email = $1",
email
)
.fetch_optional(&pool)
.await?;

During cargo build, SQLx:
- Connects to your database
- Parses and validates the SQL query
- Verifies column types match your Rust struct
- Fails compilation if anything doesn’t match
CI tip: SQLx also supports an offline workflow (via cargo sqlx prepare) so builds don't need a live database connection. This keeps CI deterministic while preserving compile-time validation.

A renamed email column? The build breaks. This moves bugs from runtime (where they hurt users) to compile time (where they only annoy developers).
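A typical offline setup looks like this (a sketch assuming a recent sqlx-cli; exact flags may differ by version):

```shell
# One-time, and after any schema or query change, with DATABASE_URL
# pointing at a dev database: write query metadata into the .sqlx/ directory.
cargo sqlx prepare --workspace

# In CI: validate all queries against the committed metadata, no database needed.
SQLX_OFFLINE=true cargo build
```

Committing the generated metadata keeps the compile-time checks while letting CI run fully offline.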
The Impact
Before SQLx, you’d discover this in production:
ERROR: column "email" does not exist

With SQLx, you discover it during cargo build:
error: no such column: `email` in table `users`
--> src/services/auth.rs:42:5

Rust’s macro ecosystem makes this kind of schema-aware checking unusually ergonomic. SQLx’s online/offline workflow is one of the most seamless ways to get compile-time confidence in real SQL.
Zero-Cost Async & Type-Driven Security
Zero-Cost Abstractions
Rust compiles async/await into state machines at compile time. The transform itself introduces no hidden allocations or dynamic dispatch; futures remain lightweight (often stack-resident until you spawn them), while Tokio provides the scheduling and I/O layer with low, predictable overhead:
#[tokio::main]
async fn main() -> anyhow::Result<()> {
let config = Config::init()?;
telemetry::init(&config.telemetry)?;
let (app, addr) = startup::build_app(&config).await?;
let listener = TcpListener::bind(addr).await?;
info!("Server listening on {addr}");
axum::serve(listener, app)
.with_graceful_shutdown(shutdown_signal())
.await?;
Ok(())
}

Every await becomes a state transition in the generated state machine. In practice, the runtime cost is often close to what you'd write by hand in a lower-level language, while still preserving memory safety.
Type-Driven Security
Rust’s type system enforces security patterns:
1. Secret Protection
use secrecy::SecretString;
pub struct Config {
jwt_secret: SecretString, // Can't accidentally log
db_password: SecretString, // Debug trait redacts value
}

2. Lifetime Safety
// Compiler ensures database connections don't outlive transactions
pub async fn create_session<'a>(
pool: &'a PgPool,
params: CreateSessionParams<'a>, // Borrows from caller
) -> Result<Session, AppError>

3. Thread-Safe Sharing
// Arc provides safe sharing without locks
let context = Arc::new(ServiceContext::new(db, email, s3));
let auth_service = AuthService::new(context.clone());
let user_service = UserService::new(context.clone());

During development, the compiler caught issues that would have been subtle runtime bugs in other languages. For example, an accidental clone of a token hash (instead of a reference to it) was flagged immediately by the borrow checker, prompting a review of the comparison path for timing side-channels.
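The secret-protection pattern can be illustrated with a minimal std-only sketch. This hypothetical `Redacted` wrapper is not the secrecy crate's actual implementation, but it shows the same idea: Debug output never contains the value, and access is an explicit, greppable call:

```rust
use std::fmt;

// Minimal sketch of the secret-wrapper pattern (the real service uses
// secrecy::SecretString): Debug never prints the inner value.
struct Redacted(String);

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Redacted([REDACTED])") // value cannot reach logs via {:?}
    }
}

impl Redacted {
    // Access is explicit, like ExposeSecret in the secrecy crate
    fn expose(&self) -> &str {
        &self.0
    }
}

fn main() {
    let jwt_secret = Redacted("super-secret".into());
    // A debug log of the config cannot leak the value:
    assert_eq!(format!("{:?}", jwt_secret), "Redacted([REDACTED])");
    assert_eq!(jwt_secret.expose(), "super-secret");
}
```

Because every read goes through `expose()`, a code review can find all the places where the secret actually leaves its wrapper.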
Part II: Architecture
Architecture Overview
The service follows a layered architecture with clear separation of concerns:
auth-service-rs/
├── src/ # Service binary (wiring, routing)
│ ├── main.rs # Entry point, telemetry setup
│ ├── startup.rs # Server wiring, dependency injection
│ ├── routes.rs # REST endpoints (health, verify-email)
│ ├── config.rs # Environment config with validation
│ ├── services/ # gRPC services implementations
│ ├── middleware/ # Tower layers (auth, tracing, IP)
│ └── core/ # Service context, URLs, password utils
├── crates/
│ ├── proto/ # Protobuf definitions + codegen
│ ├── core/ # Domain logic (JWT, tokens, password, validation)
│ ├── db/ # SQLx repositories, migrations, atomic SQL functions
│ ├── email/ # SMTP provider
│ ├── mailjet/ # Mailjet provider
│ ├── storage/ # S3-compatible storage
│ └── telemetry/ # OTel, Prometheus, Sentry glue
└── proto/ # Raw .proto files (auth.proto, users.proto)

Key Architectural Principles
1. Single Responsibility via Crate Organization
The project uses Cargo workspace to separate concerns:
Benefits of this structure:
- Isolated compilation — Changes to auth-email don't require recompiling auth-db
- Testability — Each crate can be unit tested independently
- Reusability — auth-core can be imported by other Rust services
- Clear boundaries — Dependencies flow inward: services → repositories → domain
2. Dependency Injection via ServiceContext
- All infrastructure dependencies (DB, email, storage) injected through Arc<ServiceContext>
- Services remain testable and infrastructure-agnostic
3. Compile-Time Verification
- SQL queries verified against database schema during build
- Proto validation rules embedded and checked at compilation
- Type-safe error conversions prevent runtime surprises
4. Zero-Trust Security
- JWT validation occurs at the middleware layer with O(1) public route lookup
- Every authenticated endpoint receives an AuthInfo extension
- Secrets are never logged thanks to the SecretString type
5. Graceful Degradation
- Optional infrastructure (email, object storage) can be missing; dependent features are skipped while core authentication keeps working
Protocol Multiplexing: One Port for Everything
Traditional microservices run gRPC and REST on separate ports, requiring duplicate middleware stacks, extra Service and ingress configuration, and more certificates and health checks to manage.
The Innovation
This service handles gRPC, gRPC-Web, and REST on one port using Axum’s router composition:
pub async fn build_app(config: &Config) -> Result<(Router, SocketAddr)> {
// gRPC services
let grpc_routes = Routes::new(health_service)
.add_service(auth_service)
.add_service(user_service);
// REST routes
let rest_router = rest_routes(state);
// gRPC-Web layer enables browsers (HTTP/1.1)
let grpc_router = grpc_routes
.into_axum_router()
.layer(GrpcWebLayer::new());
let app = rest_router.merge(grpc_router).layer(middleware);
Ok((app, addr))
}

Protocol Detection
Requests are routed based on the Content-Type header:
- application/grpc → native gRPC (mobile and desktop clients)
- application/grpc-web → gRPC-Web (browser clients)
- application/json → REST (health checks, webhooks, and callbacks)
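The dispatch rule can be sketched as a small classifier. This is illustrative only; the actual service relies on Axum router composition and GrpcWebLayer rather than manual dispatch, and the `Protocol` enum here is hypothetical:

```rust
#[derive(Debug, PartialEq)]
enum Protocol {
    Grpc,
    GrpcWeb,
    Rest,
}

// Classify a request by its Content-Type header (sketch of the routing rule).
fn classify(content_type: &str) -> Protocol {
    // `application/grpc-web` must be checked before plain `application/grpc`,
    // because the latter is a prefix of the former.
    if content_type.starts_with("application/grpc-web") {
        Protocol::GrpcWeb
    } else if content_type.starts_with("application/grpc") {
        Protocol::Grpc
    } else {
        Protocol::Rest
    }
}

fn main() {
    assert_eq!(classify("application/grpc"), Protocol::Grpc);
    assert_eq!(classify("application/grpc-web+proto"), Protocol::GrpcWeb);
    assert_eq!(classify("application/json"), Protocol::Rest);
}
```

Note the ordering subtlety: gRPC-Web content types share the `application/grpc` prefix, so the more specific match must come first.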
The Impact
DevOps complexity: eliminated
- One port to expose
- One Kubernetes Service to configure
- One health check endpoint to monitor
- One TLS certificate to manage
- No Nginx or Envoy configuration required
Unified Tower Middleware
The service implements a unified middleware pipeline that works seamlessly for both gRPC and REST, eliminating the need for separate interceptor chains. This section covers two key innovations: perfect-hash route matching and the Tower middleware stack.
Perfect Hash Route Matching
Most auth middleware needs a “public route allowlist” (health checks, sign-up, recovery endpoints). Many implementations use HashMap or linear checks.
This service uses a compile-time perfect hash:
use phf::phf_set;
static PUBLIC_ROUTES: phf::Set<&'static str> = phf_set! {
// gRPC methods
"Authenticate", "SignUp", "RecoveryStart", "RefreshTokens",
// REST endpoints
"/health", "/metrics", "/verify-email",
};
fn is_public_route(path: &str) -> bool {
PUBLIC_ROUTES.contains(path) // O(1), zero allocation
}

phf_set! generates a perfect hash function at compile time, so membership checks are O(1) with no collisions and no heap allocation.
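For readers without the phf dependency, similar semantics can be approximated in plain std Rust with a const-pattern match. This sketch is allocation-free and checked at compile time, though unlike phf it is pattern matching rather than a true perfect hash:

```rust
// Allocation-free allowlist check via match (sketch; the service itself
// uses phf::phf_set! for a compile-time perfect hash).
fn is_public_route(path: &str) -> bool {
    matches!(
        path,
        "Authenticate" | "SignUp" | "RecoveryStart" | "RefreshTokens"
            | "/health" | "/metrics" | "/verify-email"
    )
}

fn main() {
    assert!(is_public_route("/health"));
    assert!(is_public_route("SignUp"));
    assert!(!is_public_route("/admin")); // everything else requires auth
}
```

Either way, the allowlist lives in one static place, so adding a public route is a one-line, compiler-checked change.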
Protocol-Aware Error Responses
The same auth layer returns the right “shape” of error for gRPC vs REST:
fn build_error_response(err: &JwtError, is_grpc: bool) -> Response<Body> {
if is_grpc {
// gRPC typically uses HTTP 200 and carries status in `grpc-status` trailers
Response::builder()
.status(StatusCode::OK)
.header("grpc-status", "16") // UNAUTHENTICATED
.header("grpc-message", err.to_string())
.body(Body::empty())
.unwrap()
} else {
// REST uses standard HTTP status codes
Response::builder()
.status(StatusCode::UNAUTHORIZED)
.header("www-authenticate", "Bearer")
.body(Body::from(json!({ "error": err.to_string() }).to_string()))
.unwrap()
}
}

This enables one authentication layer for all protocols.
The Tower Middleware Stack
Here’s the complete middleware pipeline:
let middleware = ServiceBuilder::new()
.layer(RequestIdLayer::new()) // 1. Extract/generate request ID
.layer(TraceLayer::new_for_http()) // 2. Distributed tracing
.layer(TimeoutLayer::new(Duration::from_secs(10))) // 3. Request timeout
.layer(cors) // 4. CORS handling
.layer(AuthLayer::new(jwt_validator)); // 5. JWT authentication

Key insight: Tower’s middleware abstraction is protocol-agnostic — the same Layer trait works for gRPC and REST because both are HTTP-based.
Benefits of unified middleware:
- Single Implementation — Write authentication once; it applies everywhere
- Consistent Behavior — Same error handling, logging, and tracing for all protocols
- Composable Layers — Stack middleware like building blocks using Tower’s zero-cost abstraction
- Protocol-Aware Responses — Middleware detects Content-Type and formats errors appropriately
If you need gRPC-only behavior (custom metadata/trailers), Tonic interceptors are still available:
let server = AuthServiceServer::with_interceptor(
auth_service,
|req: Request<()>| {
if let Some(_custom) = req.metadata().get("x-custom") {
// gRPC-specific logic here
}
Ok(req)
}
);

This architecture provides simplicity by default with power when needed — the best of both worlds.
Atomic Database Operations
Race conditions in authentication systems cause critical security bugs: double token use, account takeover, and privilege escalation. Traditional multi-query transactions leave windows where concurrent requests can corrupt state. This section demonstrates how to eliminate these vulnerabilities entirely.
PostgreSQL Functions
Instead of multi-step transactions in application code, critical operations are atomic database functions:
CREATE OR REPLACE FUNCTION auth.verify_email(p_token_hash BYTEA)
RETURNS auth.users
LANGUAGE plpgsql
AS $$
DECLARE
v_user_id UUID;
v_status auth.user_status;
result auth.users;
BEGIN
-- Step 1: Consume token atomically
UPDATE auth.email_verification_tokens
SET used_at = now()
WHERE token_hash = p_token_hash
AND used_at IS NULL
AND expires_at > now()
RETURNING id_user INTO v_user_id;
IF v_user_id IS NULL THEN
RAISE EXCEPTION 'TOKEN_INVALID';
END IF;
-- Step 2: Validate user status
SELECT status INTO v_status
FROM auth.users WHERE id = v_user_id;
IF v_status IN ('suspended', 'deleted') THEN
RAISE EXCEPTION 'ACCOUNT_SUSPENDED';
END IF;
-- Step 3: Verify + activate
UPDATE auth.users
SET email_verified = TRUE,
status = CASE WHEN status = 'pending' THEN 'active' ELSE status END
WHERE id = v_user_id
RETURNING * INTO result;
RETURN result;
END;
$$;

Why This Matters
Without atomic function (multi-query approach):
// ❌ Race condition possible between queries
let token = db.get_token(hash).await?;
if token.used_at.is_some() {
return Err(AppError::TokenInvalid);
}
db.mark_token_used(token.id).await?; // Another request could use token here
db.verify_user_email(token.user_id).await?;

With atomic function:
// ✅ Single atomic operation
let user = db.verify_email(hash).await?;

One call, one transaction, no race window.
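The difference can be modeled in miniature: check-then-consume as two steps races, while a single compound operation does not. A std-only sketch, with a hypothetical in-memory `TokenStore` standing in for PostgreSQL:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

struct TokenStore {
    // token_hash -> user_id; the Mutex stands in for database isolation
    tokens: Mutex<HashMap<String, u64>>,
}

impl TokenStore {
    // ✅ Validate-and-consume in one critical section: a concurrent caller
    // presenting the same token observes None, never a half-consumed state.
    fn consume(&self, hash: &str) -> Option<u64> {
        self.tokens.lock().unwrap().remove(hash)
    }
}

fn main() {
    let store = TokenStore {
        tokens: Mutex::new(HashMap::from([("h1".to_string(), 42u64)])),
    };
    assert_eq!(store.consume("h1"), Some(42)); // first use succeeds
    assert_eq!(store.consume("h1"), None);     // replay fails atomically
}
```

The PostgreSQL function plays the same role at the database layer: the `UPDATE ... WHERE used_at IS NULL ... RETURNING` statement is the single critical section.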
Another Example: Sliding Session Expiration
Session Touch (Sliding Expiration)
CREATE FUNCTION auth.touch_session(p_token_hash BYTEA, p_extend_by INTERVAL)
RETURNS TABLE (id_user UUID, expires_at TIMESTAMPTZ)
AS $$
UPDATE auth.sessions
SET last_seen_at = now(),
expires_at = GREATEST(expires_at, now() + p_extend_by),
activity_count = activity_count + 1
WHERE refresh_token = p_token_hash AND expires_at > now()
RETURNING id_user, expires_at;
$$ LANGUAGE sql;

Every token refresh atomically validates the token, records activity, and extends expiration in a single statement, leaving no window for concurrent requests to exploit.
Database Schema (Safety + Performance)
The schema leverages PostgreSQL-specific features for safety and performance:
1. UUIDv7 for Time-Ordered IDs
CREATE TABLE auth.users (
id UUID PRIMARY KEY DEFAULT uuidv7(), -- Time-ordered, index-friendly, PostgreSQL 18+
...
);

- UUIDv7 combines timestamp prefix with random suffix
- Better B-tree index performance than UUIDv4 (locality of reference)
- Sortable by creation time without separate created_at column queries
2. Domain types for normalization + validation
CREATE DOMAIN email AS TEXT
CHECK (
VALUE = lower(VALUE) AND
VALUE ~ '^[^@\s]+@[^@\s]+\.[^@\s]+$' AND
length(VALUE) <= 254
);
CREATE DOMAIN phone_e164 AS TEXT
CHECK (VALUE ~ '^\+[1-9][0-9]{1,14}$');

- Enforces normalization: Email always lowercase
- Format validation: Regex patterns at database level
- Type safety: email and phone_e164 are distinct types
3. Partial Unique Indexes (Soft Delete)
CREATE UNIQUE INDEX user_email_active_ux
ON auth.users (email)
WHERE deleted_at IS NULL AND email IS NOT NULL;

- Allows duplicate emails for soft-deleted users
- Enforces uniqueness only for active accounts
- Enables user re-registration after account deletion
4. Constraint-Based Business Rules
CONSTRAINT user_auth_method_ck
CHECK (status != 'active' OR email IS NOT NULL OR phone IS NOT NULL)

- Active users MUST have email or phone
- Database enforces business invariants
- Prevents invalid state at data layer
5. JSONB for Flexible Metadata
metadata JSONB DEFAULT '{}'::JSONB
CONSTRAINT role_permissions_valid_ck
CHECK (jsonb_typeof(permissions) = 'object')

- Schema flexibility without migrations for new fields
- GIN indexes enable fast JSONB queries
- Type constraint ensures valid JSON structure
The ServiceContext Pattern
Rather than passing individual dependencies everywhere, a shared context encapsulates all infrastructure:
#[derive(Clone)]
pub struct ServiceContext {
db: Database,
email: Option<EmailProvider>, // SMTP or Mailjet
s3: Option<Arc<S3Storage>>,
urls: UrlBuilder,
}

Benefits
1. Single Source of Truth
// Services share infrastructure via Arc
let context = Arc::new(ServiceContext::new(db, email, s3, urls));
let auth_service = AuthService::new(config, context.clone());
let user_service = UserService::new(config, context.clone());

2. Graceful Degradation
// Optional features don't break the service
impl ServiceContext {
pub fn send_email(&self, /* params */) {
let Some(email) = &self.email else {
return; // Email not configured — service still works
};
// Send email...
}
}

3. Fire-and-Forget Background Tasks
pub fn send_welcome_email(&self, user_id: Uuid, email: String, /* ... */) {
let db = self.db.clone();
let email_provider = self.email.clone();
tokio::spawn(async move {
// Create verification token
// Send email
// Log result
}); // User response is sent immediately; email delivery happens in background
}

Sign-up returns in milliseconds, while email delivery happens asynchronously. Failures are logged but never block users.
Embedded Validation with Protobuf
The service leverages compile-time validation through Protocol Buffers annotations. This approach catches invalid requests before they reach business logic, eliminates boilerplate validation code, and ensures consistent enforcement across all clients.
Validation Rules in Proto Files
message SignUpRequest {
string email = 1 [
(validate.rules).string.email = true,
(validate.rules).string.max_len = 255
];
string password = 2 [
(validate.rules).string.min_len = 8,
(validate.rules).string.max_len = 128
];
string name = 3 [
(validate.rules).string.min_len = 1,
(validate.rules).string.max_len = 255
];
}

Automatic Code Generation
The prost-validate crate generates validation code at compile time:
// In crates/proto/build.rs
prost_validate_build::Builder::new()
.configure(&mut config, proto_files, includes)
.expect("Failed to configure prost-validate");

Every generated message gets a Validator trait implementation. The service uses an extension trait for ergonomic validation:
pub trait ValidateExt {
fn validate_or_status(&self) -> Result<(), Status>;
}
impl<T: Validator> ValidateExt for T {
fn validate_or_status(&self) -> Result<(), Status> {
self.validate()
.map_err(|e| Status::invalid_argument(e.to_string()))
}
}

Usage in Handlers
pub async fn sign_up(
&self,
request: Request<SignUpRequest>,
) -> Result<Response<AuthResponse>, Status> {
let req = request.into_inner();
// Proto validation — one line, all rules checked
req.validate_or_status()?;
// Domain validation — business rules beyond proto
domain::validate_password(&req.password)?;
// All validation passed — proceed with business logic
let user = self.ctx.db().create_user(&req).await?;
// ...
}

Key benefits:
- Early Rejection — Invalid requests fail at the entry point, before any database queries execute
- Consistent Validation — The same rules are enforced across all clients: mobile, web, and CLI
- Self-Documenting API — Validation rules are visible in .proto files
- Client Generation — Clients can generate the same validation logic from proto files
- Performance — Validation compiles directly into the binary; there is no reflection or runtime rule interpretation
- Nested Validation — Complex message hierarchies are validated recursively
Domain-Specific Validation
For business rules that can’t be expressed in proto annotations, the service provides domain validators:
pub mod domain {
use tonic::Status;
/// Validate password complexity beyond length (which proto handles)
pub fn validate_password(password: &str) -> Result<(), Status> {
let has_letter = password.chars().any(char::is_alphabetic);
let has_digit = password.chars().any(|c| c.is_ascii_digit());
if !has_letter || !has_digit {
return Err(Status::invalid_argument(
"Password must contain both letters and numbers"
));
}
Ok(())
}
/// Validate E.164 phone format
pub fn validate_phone(phone: &str) -> Result<(), Status> {
if !phone.starts_with('+') || phone.len() < 8 || phone.len() > 16 {
return Err(Status::invalid_argument(
"Phone must be in E.164 format (+1234567890)"
));
}
Ok(())
}
}This two-tier validation approach (proto + domain) ensures comprehensive input validation without code duplication. Proto annotations handle structural constraints such as length, format, and required fields, while domain validators handle business logic like password complexity and semantic format validation.
Streaming APIs with async-stream
The service implements server-side streaming for efficient data transfer, especially useful for admin operations listing thousands of users.
Streaming Implementation
use async_stream::try_stream;
use tokio_stream::StreamExt;
pub fn list_users(&self, req: ListUsersRequest) -> StreamResult<User> {
let db = self.ctx.db().clone();
let stream = try_stream! {
let mut rows = db.users.stream_all_users();
while let Some(result) = rows.next().await {
let user = result.map_err(|e| {
error!(error = %e, "Failed to stream users");
Status::from(AppError::Unavailable(e.to_string()))
})?;
yield User::from(user); // Stream one record at a time
}
};
Box::pin(stream)
}

Why Streaming Matters
Without streaming (traditional approach):
// ❌ Load all users into memory
let users = db.get_all_users().await?; // OOM with 100K users
Ok(Response::new(ListUsersResponse { users }))

With streaming:
// ✅ Constant memory regardless of user count
let stream = self.list_users(req);
Ok(Response::new(Box::pin(stream)))

Benefits:
- Constant Memory — Process millions of records with a fixed memory footprint
- Progressive Results — Clients receive data as it’s fetched, reducing time to first result
- Early Termination — Clients can close the stream when satisfied, saving server resources
- Backpressure — gRPC flow control prevents overwhelming slow clients
- Database Cursors — SQLx streams map directly to PostgreSQL cursors without buffering
Real-world impact — Admin dashboard listing 50,000 users:
- Non-streaming: 200 MB memory, 3-second wait for first result
- Streaming: 5 MB memory, 50 ms to first result, progressive rendering
The async_stream::try_stream! macro provides ergonomic syntax for building streams with ? operator support — much cleaner than a manual Stream trait implementation.
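The constant-memory behavior can be seen in miniature with lazy iterators. This std-only sketch (with a hypothetical `stream_ids` producer) is not the service's async stream, but it demonstrates the same principle: items are produced on demand, never all at once:

```rust
// Lazy pipeline: each "row" is produced, mapped, and dropped in turn,
// so peak memory stays constant no matter how many rows exist.
fn stream_ids(total: u64) -> impl Iterator<Item = String> {
    (0..total).map(|id| format!("user-{id}"))
}

fn main() {
    // Take the first 3 of a million "rows" without materializing them all.
    let first: Vec<String> = stream_ids(1_000_000).take(3).collect();
    assert_eq!(first, ["user-0", "user-1", "user-2"]);
}
```

The async version adds backpressure on top: the producer pauses whenever the gRPC client falls behind, instead of buffering unboundedly.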
Part III: Security & OWASP Compliance
Multi-Layer Security Architecture
The service implements the OWASP Authentication Cheat Sheet through multiple complementary layers:
Layer 1: Password Security
Argon2id Hashing
OWASP currently recommends Argon2id and provides baseline parameters you can start from (then calibrate on your hardware). A common starting point is ~19 MiB memory, 2 iterations, parallelism 1. (Source: OWASP Password Storage Cheat Sheet.)
/// Hash password using Argon2id with OWASP-recommended parameters
pub fn hash_password(password: &str) -> Result<String, AppError> {
let salt = SaltString::generate(&mut OsRng);
// OWASP baseline: 19 MiB memory, 2 iterations, 1 parallelism
let params = Params::new(19 * 1024, 2, 1, None)
.map_err(|e| AppError::internal(format!("argon2 params: {e}")))?;
let argon2 = Argon2::new(Algorithm::Argon2id, Version::V0x13, params);
argon2
.hash_password(password.as_bytes(), &salt)
.map(|hash| hash.to_string())
.map_err(|e| AppError::internal(format!("argon2 hash: {e}")))
}
/// Verify password against stored hash (constant-time comparison)
pub fn verify_password(password: &str, hash: &str) -> bool {
use password_hash::PasswordVerifier;
let parsed_hash = match password_hash::PasswordHash::new(hash) {
Ok(h) => h,
Err(_) => return false,
};
Argon2::default()
.verify_password(password.as_bytes(), &parsed_hash)
.is_ok()
}

Argon2id is memory-hard — resistant to GPU/ASIC attacks. Cost parameters ensure cracking attempts are prohibitively expensive. The stored hash includes algorithm, parameters, and salt, making future parameter upgrades transparent.
Layer 2: Account Lockout
To prevent brute-force attacks, the service implements progressive account lockout:
pub async fn authenticate(&self, req: AuthenticateRequest) -> Result<AuthResponse> {
let user = self.find_user(&req.identifier).await?;
// Check if account locked
if let Some(locked_until) = user.locked_until {
if locked_until > Utc::now() {
return Ok(AuthResponse::locked());
}
}
// Verify password
if !password::verify(&req.password, &user.password_hash) {
let attempts = self.db.increment_failed_login(user.id).await?;
// Lock account after N failures
if attempts >= self.config.max_failed_attempts {
self.db.lock_account(user.id, self.config.lockout_duration).await?;
return Ok(AuthResponse::locked());
}
return Ok(AuthResponse::failed());
}
// Success: reset counter
self.db.reset_failed_login(user.id).await?;
Ok(self.create_session(user, req).await?)
}

Configuration via environment:

MAX_FAILED_LOGIN_ATTEMPTS=5
LOCKOUT_DURATION_MINUTES=15
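The lockout decision itself reduces to a small pure function, easy to test in isolation. This is a sketch mirroring the two settings above, with a hypothetical `after_failed_attempt` helper rather than the service's actual code:

```rust
#[derive(Debug, PartialEq)]
enum LoginCheck {
    Allowed,                 // counter incremented; user may retry
    LockedForMinutes(u32),   // threshold reached; lock the account
}

// Decide the outcome of a failed attempt under the configured policy
// (MAX_FAILED_LOGIN_ATTEMPTS / LOCKOUT_DURATION_MINUTES).
fn after_failed_attempt(failed_attempts: u32, max_attempts: u32, lockout_minutes: u32) -> LoginCheck {
    if failed_attempts >= max_attempts {
        LoginCheck::LockedForMinutes(lockout_minutes)
    } else {
        LoginCheck::Allowed
    }
}

fn main() {
    assert_eq!(after_failed_attempt(4, 5, 15), LoginCheck::Allowed);
    assert_eq!(after_failed_attempt(5, 5, 15), LoginCheck::LockedForMinutes(15));
}
```

Keeping the policy as a pure function makes the brute-force protection trivially unit-testable, separate from the database plumbing.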
Layer 3: Generic Error Messages
OWASP requirement: Never reveal whether an email exists in the system.
fn failed_auth() -> AuthResponse {
AuthResponse {
status: AuthStatus::Failed,
message: "Invalid credentials".to_string(), // Generic message
// No indication whether email exists
}
}

Same response for:
- Email doesn’t exist
- Password is wrong
- Account is pending or suspended
This prevents user enumeration attacks, where attackers attempt to harvest valid email addresses.
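One way to make this uniform by construction is to collapse every internal failure cause before anything is serialized. A sketch with a hypothetical FailReason enum (the detailed reason would still be logged server-side):

```rust
// Every distinct internal cause maps to the same client-visible message,
// so responses cannot be used to probe which emails exist.
#[allow(dead_code)]
enum FailReason {
    UnknownEmail,
    WrongPassword,
    Pending,
    Suspended,
}

fn client_message(_reason: FailReason) -> &'static str {
    "Invalid credentials"
}

fn main() {
    assert_eq!(
        client_message(FailReason::UnknownEmail),
        client_message(FailReason::WrongPassword)
    );
}
```

Because the signature discards the reason, the compiler guarantees no code path can leak it into a response.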
Layer 4: JWT Token Security
Cached Validator with Strong Configuration
pub struct JwtValidator {
encoding_key: Arc<EncodingKey>, // Cached for performance
decoding_key: Arc<DecodingKey>,
validation: Validation,
}
impl JwtValidator {
pub fn new(secret: &SecretString) -> Self {
let mut validation = Validation::new(Algorithm::HS256);
validation.set_audience(&["auth-service"]);
validation.set_issuer(&["auth-service"]);
validation.validate_exp = true; // Expiration required
validation.validate_nbf = true; // Not-before required
Self {
encoding_key: Arc::new(EncodingKey::from_secret(secret.as_bytes())),
decoding_key: Arc::new(DecodingKey::from_secret(secret.as_bytes())),
validation,
}
}
}

Token Claims
struct Claims {
sub: String, // user_id
email: String,
name: String,
role: String,
device_id: String,
installation_id: String,
iat: i64, // Issued at
exp: i64, // Expiration
nbf: i64, // Not before
aud: String, // Audience
iss: String, // Issuer
}

Layer 5: Refresh Token Rotation
Single-Use Tokens
pub async fn refresh_tokens(&self, refresh_token: String) -> Result<TokenPair> {
let token_hash = TokenGenerator::hash_token(&refresh_token);
// Atomically validate and consume token
let session = self.db.touch_session(&token_hash).await?;
// Generate new tokens
let new_access = self.jwt.generate_access_token(&session.user, ...)?;
let new_refresh = TokenGenerator::generate_secure_token();
let new_hash = TokenGenerator::hash_token(&new_refresh);
// Store new refresh token
self.db.update_session_token(session.id, &new_hash).await?;
Ok(TokenPair { access_token: new_access, refresh_token: new_refresh })
}

Key property: If an attacker steals a refresh token, they can only use it once before it’s replaced.
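Where token hashes are compared in application code rather than by the database, the comparison should be constant-time. A std-only sketch of the idea (production code would typically use a vetted crate such as subtle):

```rust
// Compare two byte strings without early exit: the running time does not
// depend on where the first mismatch occurs, closing a timing side-channel.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences instead of returning early
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b"deadbeef", b"deadbeef"));
    assert!(!ct_eq(b"deadbeef", b"deadbeee"));
}
```

A naive `a == b` can short-circuit at the first differing byte, letting an attacker measure how much of a guess was correct; the OR-accumulator removes that signal.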
Layer 6: OAuth 2.0 Readiness
The service architecture is prepared for OAuth 2.0 social login, with the database schema and proto definitions already in place.
Database schema supports OAuth providers:
CREATE TYPE oauth_provider AS ENUM (
'google', 'github', 'microsoft', 'apple', 'facebook'
);
CREATE TABLE oauth_accounts (
id UUID PRIMARY KEY DEFAULT uuidv7(),
id_user UUID NOT NULL REFERENCES users(id),
provider oauth_provider NOT NULL,
provider_id TEXT NOT NULL, -- Provider's user ID
provider_email email,
metadata JSONB DEFAULT '{}'::JSONB,
linked_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (provider, provider_id) -- One account per provider
);

Proto definitions ready:
service AuthService {
rpc GetOAuthUrl(GetOAuthUrlRequest) returns (GetOAuthUrlResponse);
rpc ExchangeOAuthCode(ExchangeOAuthCodeRequest) returns (AuthResponse);
rpc LinkOAuthProvider(LinkOAuthProviderRequest) returns (Empty);
rpc UnlinkOAuthProvider(UnlinkOAuthProviderRequest) returns (Empty);
rpc ListLinkedProviders(ListLinkedProvidersRequest) returns (ListLinkedProvidersResponse);
}

Why this architecture matters:
- Schema Stability — Database supports OAuth from day one (no migration needed later)
- API Contract — Proto definitions let clients prepare for OAuth before implementation
- Graceful Degradation — Service works without OAuth; easy to add when needed
- Account Linking — Users can link multiple OAuth providers to one account
- Passwordless Accounts — Schema allows OAuth-only users (no password set)
Implementation path: When OAuth is needed, implementers only need to:
- Add OAuth client libraries (for example, the oauth2 crate)
- Implement the handler logic for redirect URLs and token exchange
- Remove the unimplemented status
No database migrations and no API changes are required — the architecture anticipated this need.
Token Refresh Flow
The service implements sliding session expiration with single-use refresh tokens, as illustrated in the following sequence:
(Sequence diagram: token refresh with single-use rotation and sliding expiration)
Security Properties:
- Single-Use Refresh Tokens — Each refresh operation invalidates the previous token
- Sliding Expiration — Active sessions remain alive for up to 90 days
- Atomic Token Rotation — A database function prevents race conditions
- Activity Tracking — The activity_count increments and last_seen_at updates with each refresh
- Forced Re-authentication — Required after 90 days of inactivity or manual session revocation
Implementation Detail: touch_session() uses GREATEST()
```sql
expires_at = GREATEST(expires_at, NOW() + $extend_by)
```

This ensures:
- Long-term expiration (90 days) is never shortened
- Recent activity extends expiration within that window
- Sessions eventually expire even with regular use (security best practice)
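The never-shorten, always-extend rule can be sketched in plain Rust (a minimal model using seconds instead of timestamps; `touch_expiry` is an illustrative name, not the service's API):

```rust
/// Mirror of the SQL GREATEST() rule: never shorten an expiry, only extend it.
/// Times are plain seconds here to keep the sketch dependency-free.
pub fn touch_expiry(expires_at: u64, now: u64, extend_by: u64) -> u64 {
    // GREATEST(expires_at, now + extend_by)
    expires_at.max(now + extend_by)
}
```

An expiry far in the future is left alone, while a near expiry is pushed out by the extension window.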
Type-Safe Error Handling
Rust’s Result type forces explicit error handling at every step. This project uses a structured approach that ensures errors are handled consistently:
Domain Errors with thiserror
```rust
#[derive(Debug, Error)]
pub enum AppError {
    #[error("Not found: {0}")]
    NotFound(String),
    #[error("Unauthenticated: {0}")]
    Unauthenticated(String),
    #[error("Permission denied: {0}")]
    PermissionDenied(String),
    #[error("Conflict: {0}")]
    Conflict(String),
    #[error("Internal: {0}")]
    Internal(String),
}
```

Automatic gRPC Conversion
```rust
impl From<AppError> for Status {
    fn from(error: AppError) -> Self {
        match error {
            AppError::NotFound(msg) => Status::not_found(msg),
            AppError::Unauthenticated(msg) => Status::unauthenticated(msg),
            AppError::PermissionDenied(msg) => Status::permission_denied(msg),
            AppError::Conflict(msg) => Status::already_exists(msg),
            AppError::Internal(msg) => {
                error!("Internal error: {}", msg); // Log details
                Status::internal("Internal server error") // Generic to client
            }
        }
    }
}
```

Critical: Internal errors are logged but never exposed to clients — preventing information leakage.
Extension Traits for Ergonomics
To simplify error handling in service methods, the project provides extension traits:
```rust
pub trait StatusExt<T> {
    fn status(self, msg: &'static str) -> Result<T, Status>;
}

impl<T, E: Display> StatusExt<T> for Result<T, E> {
    fn status(self, msg: &'static str) -> Result<T, Status> {
        self.map_err(|e| {
            error!("{}: {}", msg, e);
            Status::internal(msg)
        })
    }
}
```

Usage in handlers:
```rust
let user = self.db.get_user(id)
    .await
    .status("Failed to get user")?; // Logs error, returns generic Status
```

The `?` operator propagates errors up the call stack, with automatic conversion to `Status` at the gRPC boundary.
Session Management and Device Tracking
The service tracks sessions with rich device context, enabling users to manage their active sessions across multiple devices:
```rust
pub struct SessionInfo {
    pub device_id: String,        // Unique device identifier
    pub device_name: String,      // "iPhone 15 Pro"
    pub device_type: String,      // mobile/tablet/desktop/web
    pub client_version: String,   // App version
    pub ip_address: IpNetwork,    // Last seen IP
    pub ip_country: String,       // ISO country code (GeoIP)
    pub created_at: DateTime<Utc>,
    pub last_seen_at: DateTime<Utc>,
    pub expires_at: DateTime<Utc>,
    pub activity_count: i32,      // Number of token refreshes
    pub is_current: bool,         // Is this the calling device?
}
```

Session Model
User Experience Features Enabled
1. List All Sessions
```proto
rpc ListSessions(ListSessionsRequest) returns (ListSessionsResponse);
```

Users can view all active devices, including their locations and last activity.
2. Revoke Specific Sessions
```proto
rpc RevokeSession(RevokeSessionRequest) returns (Empty);
```

Allows actions like “Log out from iPhone” — revoking only that device’s session.
3. Revoke All Other Sessions
```proto
rpc RevokeOtherSessions(RevokeOtherSessionsRequest) returns (Empty);
```

Enables “Log out everywhere else” — keeping only the current device logged in.
Database Schema
```sql
CREATE TABLE auth.sessions (
    id UUID PRIMARY KEY DEFAULT uuidv7(),
    id_user UUID NOT NULL REFERENCES auth.users(id),
    refresh_token BYTEA NOT NULL,  -- SHA-256 hash
    device_id VARCHAR(255),
    device_name VARCHAR(255),
    device_type VARCHAR(50),
    client_version VARCHAR(100),
    ip_address INET,
    ip_country VARCHAR(2),         -- ISO country code
    created_at TIMESTAMPTZ DEFAULT now(),
    last_seen_at TIMESTAMPTZ DEFAULT now(),
    expires_at TIMESTAMPTZ NOT NULL,
    activity_count INT DEFAULT 0,
    metadata JSONB DEFAULT '{}'
);

-- Unique constraint: one session per user+device
CREATE UNIQUE INDEX session_user_device_ux
    ON auth.sessions (id_user, device_id)
    WHERE device_id IS NOT NULL;
```

Key insight: The unique constraint on `(id_user, device_id)` ensures that refreshing tokens updates the existing session rather than creating duplicates.
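The upsert behavior that the partial unique index enables can be modeled with an in-memory map (an illustrative sketch; `SessionStore` is not the real repository type):

```rust
use std::collections::HashMap;

/// In-memory model of the (id_user, device_id) uniqueness rule:
/// refreshing on the same device updates the row instead of adding one.
#[derive(Default)]
pub struct SessionStore {
    // (user, device) -> activity_count
    sessions: HashMap<(String, String), u32>,
}

impl SessionStore {
    /// Upsert semantics, like INSERT ... ON CONFLICT (id_user, device_id) DO UPDATE.
    /// Returns the updated activity count for that user+device pair.
    pub fn touch(&mut self, user: &str, device: &str) -> u32 {
        let count = self
            .sessions
            .entry((user.to_string(), device.to_string()))
            .or_insert(0);
        *count += 1;
        *count
    }

    /// Number of distinct sessions — never grows on repeat refreshes.
    pub fn len(&self) -> usize {
        self.sessions.len()
    }
}
```

Two refreshes from the same device leave one session; a second device creates a second session.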
Part IV: Authentication Flows
Authentication Flow
The authentication flow implements OWASP best practices, including account lockout, generic error messages, and secure session management:

Security Highlights
Account Lockout — Progressive penalties prevent brute-force attacks:
- Failed attempts are tracked per user
- The threshold is configurable (default: 5 attempts)
- Temporary lockout applies (default: 15 minutes)
- Lockout information is returned with a `retry_after` value in seconds
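The lockout decision can be sketched in a few lines, assuming a fixed threshold and lockout window (both are configurable in the real service; the function name is illustrative):

```rust
/// Returns Some(retry_after_seconds) when the account is locked,
/// None while the user still has attempts left.
pub fn retry_after_secs(failed_attempts: u32, threshold: u32, lockout_secs: u64) -> Option<u64> {
    if failed_attempts >= threshold {
        Some(lockout_secs) // locked: client must wait before retrying
    } else {
        None // below the threshold: no lockout yet
    }
}
```

With the defaults above, the fifth failed attempt triggers a 15-minute (900 s) lockout.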
Generic Error Messages — The same failure response is returned for unknown users and wrong passwords, preventing user enumeration
Device-Based Sessions — Multi-device support:
- Each device receives a separate session tracked by `device_id`
- Sessions include IP address, country, device type, client version, and last activity
- Users can list and revoke individual devices
Password Recovery Flow
Password recovery implements OWASP recommendations for secure reset workflows:

Security Features
1. No User Enumeration (OWASP Critical)
- `RecoveryStart` always returns success, even for non-existent emails
- Background processing prevents timing attacks
- No indication whether email exists in system
2. Atomic Token Management
- Single database query atomically invalidates old tokens and creates new one
- Prevents race condition where multiple tokens could be valid
- CTE (Common Table Expression) ensures transactional integrity
3. Single-Use Tokens
- `used_at` timestamp marks the token as consumed
- Database queries filter on `WHERE used_at IS NULL`
- Atomic `UPDATE ... RETURNING` prevents double-use
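A sketch of what such a single-statement rotation can look like, with hypothetical table and column names (the project's actual SQL may differ):

```sql
-- Illustrative sketch: invalidate any live tokens and mint a new one
-- in one statement, so no window exists where two tokens are valid.
WITH invalidated AS (
    UPDATE auth.password_reset_tokens
    SET used_at = now()
    WHERE id_user = $1 AND used_at IS NULL
)
INSERT INTO auth.password_reset_tokens (id_user, token_hash, expires_at)
VALUES ($1, $2, now() + interval '1 hour')
RETURNING id;
```

Because both steps run inside one statement, they commit or roll back together.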
4. Forced Session Revocation
- All existing sessions terminated after password reset
- Ensures compromised account is fully secured
- User must log in again on all devices
5. Notification Email
- User receives confirmation after password change
- Alerts to unauthorized reset attempts
- Includes timestamp and IP information (via GeoIP)
Email Verification Flow
Email verification supports dual paths to accommodate different client types:

Path Comparison
| Feature | gRPC (Primary) | REST (Fallback) |
| ------------------- | ------------------------- | --------------------------- |
| **Client** | Mobile, Desktop, Web SPA | Email clients, old browsers |
| **Auto-Login** | ✅ Yes (returns tokens) | ❌ No (redirect only) |
| **User Experience** | Seamless (one-click done) | Manual login required |
| **Deep Linking** | ✅ App handles URL | ❌ Browser navigation |
| **Implementation** | Same DB function | Same DB function |
| **Security**        | JWT tokens via gRPC       | Redirect to frontend        |

Key Implementation Details
1. Atomic Database Function auth.verify_email()
```sql
CREATE OR REPLACE FUNCTION auth.verify_email(p_token_hash BYTEA)
RETURNS auth.users AS $$
DECLARE
    v_user_id UUID;
    v_status auth.user_status;
    result auth.users;
BEGIN
    -- Step 1: Consume token atomically
    UPDATE auth.email_verification_tokens
    SET used_at = now()
    WHERE token_hash = p_token_hash
      AND used_at IS NULL
      AND expires_at > now()
    RETURNING id_user INTO v_user_id;

    IF v_user_id IS NULL THEN
        RAISE EXCEPTION 'TOKEN_INVALID';
    END IF;

    -- Step 2: Check account status
    SELECT status INTO v_status FROM auth.users WHERE id = v_user_id;
    IF v_status IN ('suspended', 'deleted') THEN
        RAISE EXCEPTION 'ACCOUNT_SUSPENDED';
    END IF;

    -- Step 3: Verify email and activate
    UPDATE auth.users
    SET email_verified = TRUE,
        status = CASE WHEN status = 'pending' THEN 'active' ELSE status END
    WHERE id = v_user_id
    RETURNING * INTO result;

    RETURN result;
END;
$$ LANGUAGE plpgsql;
```

2. Graceful Client Fallback
- Modern clients intercept verification URLs and call gRPC for auto-login
- Email clients clicking links hit REST endpoint for graceful redirect
- Same backend logic ensures consistency
3. Token Security
- 256-bit random tokens (43-character URL-safe base64)
- SHA-256 hashed before storage (prevents token database leaks)
- Single-use enforcement via the `used_at` column
- 24-hour expiration (configurable)
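To see why 256 random bits come out as 43 URL-safe characters, here is a minimal unpadded URL-safe base64 encoder (a dependency-free sketch; production code would use the `base64` crate):

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Minimal unpadded URL-safe base64: every 6 bits of input become one
/// character, so 32 bytes (256 bits) encode to ceil(256 / 6) = 43 chars.
pub fn b64_url_nopad(data: &[u8]) -> String {
    let mut out = String::new();
    for chunk in data.chunks(3) {
        let mut buf = [0u8; 3];
        buf[..chunk.len()].copy_from_slice(chunk);
        let n = (u32::from(buf[0]) << 16) | (u32::from(buf[1]) << 8) | u32::from(buf[2]);
        let sextets = [(n >> 18) & 63, (n >> 12) & 63, (n >> 6) & 63, n & 63];
        // 3 input bytes -> 4 output chars; a short final chunk emits fewer
        let emit = match chunk.len() { 1 => 2, 2 => 3, _ => 4 };
        for &s in &sextets[..emit] {
            out.push(ALPHABET[s as usize] as char);
        }
    }
    out
}
```

Hashing the raw token with SHA-256 before storage means a leaked token table reveals nothing usable.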
Part V: Production Operations
Observability from Day One
Production systems need visibility. This service integrates four complementary layers of observability:
Layer 1: Structured Logging (tracing)
Every request receives a trace ID that follows it through all operations. The tracing crate provides structured, machine-parseable logs:
```rust
use tracing::{info, instrument, Span};

#[instrument(skip(self), fields(user_id))]
pub async fn authenticate(&self, req: AuthenticateRequest) -> Result<AuthResponse> {
    let user = self.find_user(&req.identifier).await?;
    // Record user_id in the span so all subsequent logs carry it
    Span::current().record("user_id", tracing::field::display(user.id));
    info!("User authenticated successfully");
    Ok(self.create_session(user, req).await?)
}
```

Optionally, logs are output as structured JSON for easy ingestion by monitoring systems:
```json
{
  "timestamp": "2026-01-23T15:42:17.123Z",
  "level": "INFO",
  "message": "User authenticated successfully",
  "request_id": "01JQBX...",
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "span": "authenticate"
}
```

Layer 2: Prometheus Metrics
```rust
use std::time::Instant;
use metrics::{counter, histogram};

pub async fn authenticate(&self, req: AuthenticateRequest) -> Result<AuthResponse> {
    let start = Instant::now();
    let result = self.authenticate_impl(req).await;
    histogram!("auth_duration_ms").record(start.elapsed().as_millis() as f64);
    if result.is_ok() {
        counter!("auth_success_total").increment(1);
    } else {
        counter!("auth_failure_total").increment(1);
    }
    result
}
```

Metrics exposed at `/metrics`:
- `auth_duration_ms` — Authentication latency histogram
- `auth_success_total` — Successful authentications
- `auth_failure_total` — Failed authentications
- `db_query_duration_ms` — Database query latencies
- `grpc_request_total` — Request counts by method
- `http_requests_total` — REST endpoint hits
- `active_sessions` — Current active session count
Example Prometheus queries:
```promql
# P99 authentication latency
histogram_quantile(0.99, rate(auth_duration_ms_bucket[5m]))

# Authentication failure rate
rate(auth_failure_total[5m])
  / (rate(auth_success_total[5m]) + rate(auth_failure_total[5m]))

# Database query errors
rate(db_query_errors_total[5m])
```

Layer 3: Distributed Tracing (OpenTelemetry)
```rust
pub fn setup_telemetry(config: &TelemetryConfig) -> anyhow::Result<TelemetryGuard> {
    // Initialize the OpenTelemetry OTLP exporter
    let tracer = opentelemetry_otlp::new_pipeline()
        .tracing()
        .with_exporter(
            opentelemetry_otlp::new_exporter()
                .tonic()
                .with_endpoint(&config.otlp_endpoint),
        )
        .install_batch(opentelemetry_sdk::runtime::Tokio)?;

    // Combine OTLP spans with JSON console logging
    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(tracer.clone()))
        .with(fmt::layer().json())
        .init();

    Ok(TelemetryGuard { tracer })
}
```

Sends traces to Jaeger, Tempo, or any OTLP collector for distributed request tracing across services.
Layer 4: Error Tracking (Sentry)
```rust
#[cfg(feature = "sentry")]
pub fn init_sentry(dsn: &str) -> ClientInitGuard {
    sentry::init((dsn, sentry::ClientOptions {
        release: Some(env!("CARGO_PKG_VERSION").into()),
        environment: Some(
            std::env::var("ENVIRONMENT")
                .unwrap_or_else(|_| "development".into())
                .into(),
        ),
        ..Default::default()
    }))
}
```

Automatic error capture with context:
- Stack traces
- Request metadata
- User IDs (when authenticated)
- Environment information
Cloud-Native Deployment
Docker: Optimized for Size and Security
The Dockerfile uses a multi-stage build with static linking to produce a minimal, secure image:
```dockerfile
# Build stage
FROM rust:1.93-alpine AS build
RUN apk add --no-cache musl-dev protobuf-dev ca-certificates
COPY . /app
WORKDIR /app
RUN cargo build --release --locked && \
    cp target/release/auth-service /service

# Runtime stage
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=build /service /service
ENV RUST_LOG=info
EXPOSE 8080
ENTRYPOINT ["/service"]
```

Result: a 15–25 MB image with a minimal OS-level attack surface (no shell, no package manager, no distro userland).
Multi-Architecture CI/CD: Native Builds at Scale
The Challenge: Cross-compilation with QEMU is painfully slow, often taking 30+ minutes for Rust projects. Most CI/CD pipelines suffer from this bottleneck.
The Solution: Native builds on platform-specific runners, where each architecture builds on its own hardware.
```yaml
name: Build & Publish multi-arch image
jobs:
  # ARM64: Native build on ARM runner (GitHub hosted)
  build-arm64:
    runs-on: ubuntu-24.04-arm    # ARM64 runner
    steps:
      - uses: docker/build-push-action@v6
        with:
          platforms: linux/arm64
          push: true
          push-by-digest: true   # Push by digest for manifest merge
          cache-from: type=gha,scope=arm64
          cache-to: type=gha,scope=arm64,mode=max

  # AMD64: Native build on x86 runner
  build-amd64:
    runs-on: ubuntu-latest       # AMD64 runner
    steps:
      - uses: docker/build-push-action@v6
        with:
          platforms: linux/amd64
          push: true
          push-by-digest: true
          cache-from: type=gha,scope=amd64

  # Merge: Create multi-arch manifest
  manifest:
    needs: [build-arm64, build-amd64]
    steps:
      - name: Create and push manifest
        run: |
          docker buildx imagetools create \
            -t ghcr.io/org/auth-service:latest \
            ghcr.io/org/auth-service@sha256:$ARM64_DIGEST \
            ghcr.io/org/auth-service@sha256:$AMD64_DIGEST
```

Performance comparison:
| Approach | Build Time | Cost (GitHub Actions) |
| -------------------------- | -------------- | -------------------------------- |
| **QEMU cross-compilation** | 35-45 min | High (single long-running job) |
| **Native parallel builds** | 4-6 min | Low (parallel, shorter duration) |
| **Speedup**                | **~8x faster** | **60% cost reduction**           |

Key innovations:
- Parallel Execution — Both architectures build simultaneously
- Layer Caching — GitHub Actions cache (`type=gha`) persists Cargo dependencies across builds
- Digest-Based Push — Each build pushes by digest; the manifest job merges them
- SBOM and Provenance — Security metadata is included automatically
Aggressive Build Optimizations
The Cargo.toml profile maximizes release binary optimization:
```toml
[profile.release]
opt-level = 3     # Maximum optimization
lto = true        # Link-time optimization (whole-program)
codegen-units = 1 # Single codegen unit for better optimization
strip = true      # Strip symbols (reduces binary size ~30%)
```

Impact on binary size:
| Configuration | Binary Size | Image Size |
| ------------------------- | ----------- | ---------- |
| Debug build | 180 MB | N/A |
| Release (no LTO) | 45 MB | 50 MB |
| Release (with LTO) | 28 MB | 32 MB |
| **Release (LTO + strip)** | **12 MB**   | **15 MB**  |

LTO (Link-Time Optimization) enables:
- Cross-crate inlining, where functions from dependencies get inlined
- Dead code elimination across the entire program
- More aggressive constant propagation
Trade-off: Longer compile time (5–7 minutes) for 60% smaller binaries. In production, smaller images mean:
- Faster container pulls
- Lower storage costs
- Faster cold starts (Cloud Run, Lambda)
- Reduced attack surface (fewer symbols to exploit)
The Dockerfile uses static musl linking for a truly standalone binary:
```dockerfile
RUN cargo build --release --locked && \
    cp target/release/auth-service /service

# Runtime: scratch (0 bytes base)
FROM scratch
COPY --from=build /service /service
ENTRYPOINT ["/service"]
```

Result: a 15 MB self-contained image with zero dependencies. Compare to:
- Java Spring Boot: 250–400 MB (JRE + dependencies)
- Node.js: 150–200 MB (Node runtime + node_modules)
- Go: 20–30 MB (static binary)
- Rust (this project): 15 MB (static binary, no runtime)
Google Cloud Run Deployment
One-command deployment:
```shell
gcloud run deploy auth-service \
  --image ghcr.io/zs-dima/auth-service-rs:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars PORT=8080 \
  --set-secrets=JWT_SECRET_KEY=jwt-secret:latest,DB_PASSWORD=db-password:latest
```

Cloud Run automatically:
- Scales to zero when idle (pay per request)
- Handles TLS certificates
- Provides load balancing
- Injects the `PORT` environment variable
Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-service
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: auth-service
          image: ghcr.io/zs-dima/auth-service-rs:latest
          resources:
            requests:
              memory: "32Mi" # Rust's small footprint
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "500m"
          env:
            - name: PORT
              value: "8080"
            - name: JWT_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: auth-secrets
                  key: jwt-secret
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
```

Note the resource requirements: a 32 MB request and a 128 MB limit. Compare this to Java’s typical 512 MB–2 GB requirements.
The Cross-Platform Client
The Flutter client (auth-app) demonstrates one codebase for all platforms: Android, iOS, Windows, macOS, Linux, and Web.
Platform-Aware gRPC Channel
```dart
ClientChannelBase getChannel(String host, int port) {
  if (kIsWeb) {
    // Web: Use XHR-based gRPC-Web
    return GrpcWebClientChannel.xhr(
      Uri.parse('https://$host:$port'),
    );
  } else {
    // Native: Use HTTP/2 channel
    return ClientChannel(
      host,
      port: port,
      options: ChannelOptions(
        credentials: ChannelCredentials.secure(),
      ),
    );
  }
}
```

Key insight: The server’s protocol multiplexing enables this — browsers use gRPC-Web (which supports server-side streaming), native apps use native gRPC, and both talk to the same backend.
Automatic Token Injection
```dart
class AuthInterceptor extends ClientInterceptor {
  final TokenStorage _tokens;

  AuthInterceptor(this._tokens);

  @override
  ResponseFuture<R> interceptUnary<Q, R>(
    ClientMethod<Q, R> method,
    Q request,
    CallOptions options,
    ClientUnaryInvoker<Q, R> invoker,
  ) {
    final token = _tokens.accessToken;
    if (token != null) {
      options = options.mergedWith(
        CallOptions(metadata: {'authorization': 'Bearer $token'}),
      );
    }
    return invoker(method, request, options);
  }
}
```

Every gRPC call automatically includes the access token, eliminating manual header management.
Part VI: Getting Started & Comparison
Rust vs. Traditional Stacks
Having implemented authentication services in Go, Node.js, and Rust, here are the concrete improvements that Rust delivers:
1. Compile-Time Correctness vs Runtime Surprises
Go/Node.js: Database schema changes discovered in production
```go
// Go: Compiles fine, fails at runtime if a column is renamed
user, err := db.Query("SELECT id, email FROM users WHERE email = ?", email)
// ERROR: column "email" does not exist (discovered in production!)
```

Rust: Database schema changes are discovered during `cargo build`
```rust
// Rust: Fails compilation if the column doesn't exist
let user = sqlx::query_as!(User, "SELECT id, email FROM users WHERE email = $1", email);
// compile error: column "email" does not exist (caught at compile time!)
```

Impact: This eliminates an entire class of database-related production incidents. Schema migrations now require corresponding code changes before deployment.
2. Memory Safety Without Runtime Overhead
Go: Garbage collector pauses cause latency spikes
- P99 latency: 15–30ms (GC pressure under load)
- Memory overhead: 20–40MB baseline + GC metadata
- Unpredictable GC pauses during traffic spikes
Node.js: Worse GC characteristics
- P99 latency: 30–80ms (V8 GC)
- Memory overhead: 70–150MB baseline
- Full GC pauses can exceed 100ms
Rust: No garbage collector
- P99 latency: 2–10ms (consistent)
- Memory overhead: 5–15MB baseline
- Zero GC pauses — latency remains stable under load
Real-world metric: Authentication endpoint at 10,000 req/s:
- Go: 99th percentile 25ms, 99.9th 120ms (GC spikes)
- Rust: 99th percentile 8ms, 99.9th 12ms (stable)
3. Type Safety Prevents Vulnerability Classes
Go/Node.js: String manipulation errors cause security issues
```go
// Go: Easy to accidentally log secrets
log.Printf("Config: %+v", config) // Oops, logged the JWT secret!
```

```javascript
// Node.js: JSON serialization exposes secrets
res.json(config) // Sent the JWT secret to the client!
```

Rust: The type system enforces secret protection
```rust
use secrecy::SecretString;

pub struct Config {
    jwt_secret: SecretString, // Debug output redacts the value
}

println!("{:?}", config); // The secret appears only as a redacted placeholder
```

Using `SecretString` makes it harder to accidentally expose secrets through logging or serialization.
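The mechanism can be demonstrated with a std-only wrapper (a simplified stand-in for `secrecy::SecretString`; `Redacted` and `expose` are illustrative names):

```rust
use std::fmt;

/// Std-only sketch of the idea: the Debug impl never prints the inner value,
/// so `{:?}` formatting — the usual leak path — is safe by construction.
pub struct Redacted(String);

impl Redacted {
    pub fn new(s: impl Into<String>) -> Self {
        Redacted(s.into())
    }

    /// Access must be explicit, which makes accidental logging visible in review.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("Redacted(***)")
    }
}
```

Any code that needs the real value has to call `expose()`, which stands out in code review.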
4. Zero-Cost Abstractions Enable Better Architecture
Go: Interface indirection has runtime cost
```go
type Repository interface {
    GetUser(id string) (*User, error)
}

// Every call goes through a vtable — measurable overhead at scale
user, err := repo.GetUser(id)
```

Rust: Trait dispatch is optimized away at compile time
```rust
trait Repository {
    async fn get_user(&self, id: Uuid) -> Result<User, AppError>;
}

// Monomorphization means a direct call — zero overhead
let user = repo.get_user(id).await?;
```

This enables layered architectures (repository → service → handler) without performance penalties. Rust’s design philosophy: abstractions should carry no runtime cost.
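A minimal, compilable illustration of static dispatch (simplified to a synchronous method; `Postgres` and `fetch` are illustrative names):

```rust
/// Static dispatch: the compiler generates a specialized copy of `fetch`
/// per concrete type (monomorphization), so the call is direct — no vtable.
trait Repository {
    fn get_user(&self, id: u64) -> String;
}

struct Postgres;

impl Repository for Postgres {
    fn get_user(&self, id: u64) -> String {
        format!("user-{id}")
    }
}

/// Generic over `R: Repository` — compiles to a direct call for `Postgres`,
/// unlike `&dyn Repository`, which would go through a vtable.
fn fetch<R: Repository>(repo: &R, id: u64) -> String {
    repo.get_user(id)
}
```

Switching the bound to `&dyn Repository` is the Rust equivalent of Go's interface call; with generics, the indirection disappears at compile time.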
5. Dependency Management and Build Reproducibility
Node.js: node_modules chaos
- 300+ MB for typical auth service
- Transitive dependency hell (the left-pad incident)
- `npm audit` regularly reports 50+ vulnerabilities
Go: Better, but still issues
- Vendor directory bloat
- Version conflicts in transitive deps
- No built-in vulnerability scanning
Rust: Cargo ecosystem
- `Cargo.lock` ensures reproducible builds
- `cargo audit` checks for CVEs automatically
- `cargo-deny` enforces license and security policies
- Compile-time dependency resolution
6. Fearless Refactoring
Go/Node.js: Refactoring requires extensive testing
- Rename a field? You might miss JSON tags, database columns, or API responses
- Change a function signature? It could break a caller five layers deep
- You need runtime tests to catch everything
Rust: The compiler catches breaking changes
- Rename a struct field? The compiler finds all uses
- Change a function signature? The compiler errors at every call site
- Missing error handling? The compiler forces an explicit `?` or `.unwrap()`
Real experience: Refactoring session management from ID-based to hash-based tokens:
- Rust: 2 hours — compiler guided all changes
- Go (previous implementation): 2 days — runtime testing discovered edge cases
7. Cross-Platform Deployment
Node.js: Different behavior across platforms
- Native modules require rebuilding for each platform
- Electron/pkg bundling introduces compatibility issues
- macOS vs. Linux path handling differs
Rust: True “compile once, run anywhere”
- Static musl binary works on any Linux distribution
- 15 MB Docker image from `scratch` with no OS dependencies
- ARM64 and AMD64 builds from the same source with zero code changes
Performance Summary Table
| Metric | Node.js (Express) | Go (gRPC) | **Rust (Tonic)** |
| ----------------------- | ----------------- | --------- | ---------------- |
| **Idle Memory** | 70-150 MB | 20-40 MB | **5-15 MB** |
| **P99 Latency** | 30-80 ms | 15-30 ms | **2-10 ms** |
| **Throughput (1 core)** | 5k req/s | 15k req/s | **25k req/s** |
| **Docker Image** | 200-400 MB | 30-50 MB | **15-25 MB** |
| **Cold Start** | 200-500 ms | 50-150 ms | **10-30 ms** |
| **Memory Safety** | Runtime (V8) | Runtime | **Compile-time** |
| **SQL Verification**    | Runtime           | Runtime   | **Compile-time** |

Cost Implications at Scale
For a service handling 10 million authentications per day:
Node.js deployment (AWS ECS):
- 20 replicas × 512 MB = 10 GB RAM requested
- Cost: ~$300/month (memory) + ~$100/month (CPU)
- Total: $400/month
Go deployment (AWS ECS):
- 10 replicas × 128 MB = 1.28 GB RAM requested
- Cost: ~$80/month (memory) + ~$60/month (CPU)
- Total: $140/month
Rust deployment (AWS ECS):
- 5 replicas × 32 MB = 160 MB RAM requested
- Cost: ~$15/month (memory) + ~$30/month (CPU)
- Total: $45/month
Rust is roughly 9x cheaper than Node.js and 3x cheaper than Go at the same traffic level.
What Makes This Implementation Unique?
This authentication service showcases several architectural innovations rarely seen together:
1. Protocol Multiplexing on a Single Port
- gRPC, gRPC-Web, and REST coexist seamlessly
- No reverse proxy configuration required
- Content-Type detection routes requests automatically
- Eliminates port management complexity in Kubernetes
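The routing decision boils down to inspecting the Content-Type header. A simplified sketch (the real service does this inside Tower middleware; `classify` is an illustrative name):

```rust
/// Which protocol a request belongs to, decided from its Content-Type.
#[derive(Debug, PartialEq)]
pub enum Protocol {
    Grpc,
    GrpcWeb,
    Rest,
}

/// Dispatch by Content-Type prefix. gRPC-Web must be checked first,
/// because "application/grpc-web" also starts with "application/grpc".
pub fn classify(content_type: &str) -> Protocol {
    if content_type.starts_with("application/grpc-web") {
        Protocol::GrpcWeb
    } else if content_type.starts_with("application/grpc") {
        Protocol::Grpc
    } else {
        Protocol::Rest
    }
}
```

Everything else (JSON, form posts, plain links from emails) falls through to the REST handlers.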
2. Dual-Path Email Verification
- Modern clients use gRPC with auto-login for the best user experience
- Legacy and email clients use REST with redirect for graceful fallback
- The same atomic database function ensures consistency
- Demonstrates pragmatic protocol flexibility
3. Compile-Time Database Guarantees
- SQLx verifies queries against the live database during build
- Column renames break compilation, not production
- Type-safe models prevent runtime serialization errors
- Migration safety is ensured through schema validation
4. Atomic Operations Prevent Race Conditions
- Password reset: invalidate old token and create new one in a single CTE
- Email verification: validate, consume, and activate atomically
- Token refresh: touch, extend, and rotate in one function
- The database enforces correctness, not application code
5. Native Multi-Architecture CI/CD
- Parallel ARM64/AMD64 builds on native runners
- 10x faster than QEMU emulation
- Manifest merge creates a unified multi-platform image
- Demonstrates modern cloud-native practices
6. Zero-Trust Security by Design
- Perfect-hash public routes provide O(1) lookup
- JWT validation is cached once and used everywhere
- Secrets are never logged, thanks to the `SecretString` type
- Generic error messages prevent enumeration attacks (OWASP)
7. Graceful Degradation
- Email service is optional — signup works without it
- S3 is optional — service runs without avatar support
- GeoIP is optional — sessions are created without location data
- Infrastructure dependencies don’t break core functionality
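The pattern is simply modeling optional infrastructure as `Option<T>`, so core flows compile and run without it. A minimal sketch with illustrative names (`EmailService` stands in for the real client):

```rust
/// Stand-in for an email client; `send` reports whether a mail went out.
pub struct EmailService;

impl EmailService {
    pub fn send(&self, to: &str) -> bool {
        !to.is_empty()
    }
}

/// Optional dependency: the app works whether or not email is configured.
pub struct App {
    email: Option<EmailService>,
}

impl App {
    /// Signup always succeeds; the verification email is best-effort.
    /// Returns (account_created, email_sent).
    pub fn signup(&self, email_addr: &str) -> (bool, bool) {
        let created = true; // core work always happens
        let mailed = self
            .email
            .as_ref()
            .map(|e| e.send(email_addr))
            .unwrap_or(false);
        (created, mailed)
    }
}
```

The same shape works for the optional S3 and GeoIP clients: `None` simply skips the enrichment step.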
8. Observable from Day One
- Structured logging with JSON format and trace IDs
- Prometheus metrics with histograms
- OpenTelemetry spans for distributed tracing
- Sentry error capture with context
- Monitoring is built in, not an afterthought
Monitoring and Observability
Prometheus metrics at /metrics:
# Authentication requests per second
rate(auth_requests_total[5m])
# P99 latency
histogram_quantile(0.99, rate(auth_duration_seconds_bucket[5m]))
# Error rate
rate(auth_errors_total[5m]) / rate(auth_requests_total[5m])
# Database connection pool usage
db_connections_active / db_connections_maxOpenTelemetry tracing:
- Configure `OTLP_ENDPOINT` to send traces to Jaeger, Tempo, or any OTLP collector
- Each request gets a trace ID that follows it through all operations
- Distributed tracing across microservices
Sentry error tracking:
- Set `SENTRY_DSN` to capture unhandled errors
- Includes stack traces, request metadata, and user context
- Automatic grouping and alerting
Troubleshooting
“failed to verify access token: no ‘kid’ in token header”

“database connection failed”
- Check the `DB_URL` format: `postgres://user:password@host:port/database`
- Ensure PostgreSQL is running and accessible
- Verify that firewall rules allow the connection
“failed to compile protos”
- The protobuf compiler is missing: `brew install protobuf` (macOS) or `apt install protobuf-compiler` (Linux)
- Alternatively, use the Docker build, which includes all dependencies
“SQLx query verification failed”
- The database is not running or is empty
- Run migrations: `make db-migrate`
- Generate offline data: `make db-prepare`
Email not sending
- Check that `EMAIL_PROVIDER` is set correctly (`smtp` or `mailjet`)
- Verify SMTP credentials or Mailjet API keys
- The service works without email, which is useful for local development
Testing and CI/CD
The service implements compile-time validation at multiple layers:
Compile-Time Guarantees
SQLx Query Verification — SQLx validates every query against the database during build:
```rust
// This fails at compile time if the 'email' column doesn't exist or types mismatch
let user = sqlx::query_as!(
    User,
    "SELECT id, email, password FROM auth.users WHERE email = $1",
    email
)
.fetch_optional(&pool)
.await?;
```

What SQLx catches at compile time:
- Column name typos (`emial` vs `email`)
- Type mismatches (expecting `String` when the column is `i32`)
- Missing columns in SELECT
- Invalid SQL syntax
- Enum value mismatches
```shell
# Generate offline query data for CI (no DB needed at build time)
cargo sqlx prepare

# Verify queries match the schema
cargo sqlx prepare --check
```

Additional compile-time checks:
- Proto validation rules embedded at codegen
- Clippy pedantic lints (`#![warn(clippy::pedantic)]`)
- Zero `unsafe` code enforced (`#![forbid(unsafe_code)]`)
CI/CD: Native Multi-Architecture Builds
Traditional Docker multi-platform builds use QEMU emulation (10x slower). This project uses native ARM64 runners:

| Build Method | ARM64 | AMD64 | Total |
| ------------------ | ------- | ----- | ------ |
| **Native runners** | 4 min | 3 min | 7 min |
| **QEMU emulation** | 40 min | 3 min | 43 min |
| **Speedup**        | **10x** | Same  | **6x** |

Pre-commit workflow:

```shell
make pre-commit  # Runs: fmt, clippy, test, sqlx-check
```

Production Readiness Checklist
The following table summarizes the key production-ready features of this service:
| Category | Highlights |
| ----------------- | ------------------------------------------------------------------------------------------------ |
| **Security** | OWASP compliance, Argon2id hashing, JWT rotation, account lockout, generic errors, zero `unsafe` |
| **Scalability** | Stateless (JWT), connection pooling, streaming APIs, atomic DB operations |
| **Observability** | OpenTelemetry, Prometheus `/metrics`, Sentry, structured JSON logs |
| **Reliability** | Compile-time SQL, graceful shutdown, timeout middleware, retry logic |
| **Deployment**    | 15-25 MB images, multi-arch (ARM64/AMD64), Cloud Run & K8s ready                                 |

Roadmap
| Feature | Status |
| ---------------------- | ---------------- |
| OAuth 2.0 social login | 🚧 In development |
| MFA / TOTP             | 📋 Planned       |
| Passkeys / WebAuthn    | 📋 Planned       |
| Rate limiting          | 📋 Planned       |

Want to contribute? PRs welcome for email providers (SendGrid, AWS SES), SMS providers (Twilio), Helm charts, and Terraform modules.
Conclusion
Building authentication services in Rust isn’t just about performance — it’s about confidence. The compiler catches mistakes that would become production incidents in other languages:
- Database schema mismatches are detected at compile time
- Secrets are protected from accidental logging
- Memory safety is guaranteed without garbage collector pauses
- Type-driven design prevents entire classes of vulnerabilities
Key Takeaways
- Compile-Time Guarantees Win — SQL queries are validated at compile time, and protocol validation is embedded at build time. Entire classes of bugs are eliminated before deployment.
- Zero-Cost Abstractions Are Real — Tower middleware, async/await, and perfect-hash functions compile to code as efficient as hand-written C.
- Memory Safety Equals Production Reliability — No null pointer exceptions, no use-after-free, and no data races.
- Infrastructure Services Deserve Infrastructure Languages — Authentication is the critical path for every request.
This service demonstrates that infrastructure-grade reliability is achievable with:
- 5–15 MB Docker images
- 2–10 ms P99 latencies
- Zero-cost abstractions enabling clean architecture
- Compile-time verification of database queries
- Full observability from day one
The Rust ecosystem has matured to the point where building production services is not just feasible — it’s enjoyable. Tonic, SQLx, Tower, and Tokio form a cohesive stack that rivals any language for backend development while providing guarantees that others cannot match.
The future of backend development is compiled, type-safe, and fast. The future is Rust.
Next Steps
🚀 Try It
auth-service is open-source and production-ready. Go build something amazing.
- ⚡ Live Demo — See the auth flow in action
- ⭐ Star on GitHub Service & Client — Support the project
- 🐳 Get Docker Image — Deploy in seconds
- 📘 Read Documentation — Integration guide & API reference
📚 Further Reading
- The Rust Book — Start here for Rust fundamentals
- Flutter Documentation — Flutter deep dive
- Tokio Tutorial — Async Rust deep dive
- SQLx Documentation — Compile-time SQL verification
- Tonic Guide — gRPC in Rust
- OWASP Auth Cheat Sheet
🤝 Let’s Connect
I’m a software engineer building high-performance systems. If you have questions or want to clarify details, feel free to reach out.
Blog · GitHub · LinkedIn · X / Twitter · Email
with ❤️ Dmitrii Zusmanovich