Multi-Language Code Examples

Zad Rafi

Categories: statistics, programming, tutorial

Examples of using Stata, Julia, and pseudocode in blog posts

Author

Zad Rafi

Published

January 6, 2026

Introduction

This post demonstrates how to include Stata, Julia, and pseudocode in your statistical posts on Less Likely.

Stata Examples

Stata code can be shown as a display-only block with syntax highlighting:

* ============================================
* Example: Linear Regression Analysis
* ============================================

* Load example dataset
sysuse auto, clear

* Descriptive statistics
summarize price mpg weight foreign

* Create a scatter plot
scatter price mpg, title("Price vs. MPG")

* Run linear regression
regress price mpg weight foreign

* Store estimates
estimates store model1

* Display results with confidence intervals
regress price mpg weight foreign, level(95)

* Test joint significance
test mpg weight

* Predict fitted values
predict price_hat

* Calculate residuals
predict residuals, residuals

* Plot residuals
scatter residuals price_hat, ///
  yline(0) ///
  title("Residual Plot")

Advanced Stata: Panel Data Analysis

* ============================================
* Panel Data Example
* ============================================

* Load panel dataset
webuse nlswork, clear

* Declare panel structure
xtset idcode year

* Summary statistics by panel
xtsum ln_wage age

* Fixed effects regression
xtreg ln_wage age tenure, fe
estimates store fe

* Random effects regression
xtreg ln_wage age tenure, re
estimates store re

* Hausman test (compares the stored fe and re estimates)
hausman fe re

Julia Examples

Julia code for statistical computing:

# ============================================
# Example: Linear Regression in Julia
# ============================================

using DataFrames, Statistics, GLM, Plots

# Create sample data
n = 100
x = randn(n)
y = 2.0 .+ 3.0 .* x .+ randn(n) .* 0.5

# Create DataFrame
df = DataFrame(x = x, y = y)

# Fit linear model
model = lm(@formula(y ~ x), df)

# Display results
println(model)

# Plot data and fitted line
scatter(x, y, label="Data", alpha=0.6)
plot!(x, predict(model), label="Fitted Line", linewidth=2)
xlabel!("X")
ylabel!("Y")
title!("Linear Regression")

Advanced Julia: Bootstrap Confidence Intervals

# ============================================
# Bootstrap Example
# ============================================

# Statistics provides mean and quantile; Plots provides histogram and vline!
using Statistics, Plots

# Define a function to bootstrap
function bootstrap_mean(data, n_bootstrap=1000)
    n = length(data)
    bootstrap_means = zeros(n_bootstrap)

    for i in 1:n_bootstrap
        # Resample with replacement
        sample = rand(data, n)
        bootstrap_means[i] = mean(sample)
    end

    return bootstrap_means
end

# Generate sample data
data = randn(100) .+ 5.0

# Perform bootstrap
bootstrap_samples = bootstrap_mean(data)

# Calculate confidence interval
ci_lower = quantile(bootstrap_samples, 0.025)
ci_upper = quantile(bootstrap_samples, 0.975)

println("95% Bootstrap CI: [", ci_lower, ", ", ci_upper, "]")

# Plot bootstrap distribution
histogram(bootstrap_samples,
         bins=30,
         label="Bootstrap Distribution",
         xlabel="Sample Mean",
         ylabel="Frequency")
vline!([ci_lower, ci_upper],
       label="95% CI",
       linewidth=2,
       color=:red)

Pseudocode Examples

Example 1: Algorithm Block

Algorithm 1 Bootstrap Confidence Interval

Input: Data $X = \{x_1, x_2, \ldots, x_n\}$, statistic $\theta(X)$, significance level $\alpha$ (confidence level $1 - \alpha$), number of bootstrap replicates $B$

Output: Confidence interval $[\theta_L, \theta_U]$

1. for $b = 1$ to $B$ do
   - Draw bootstrap sample $X_b^*$ by sampling $n$ observations from $X$ with replacement
   - Calculate $\theta_b^* = \theta(X_b^*)$
2. end for
3. Sort $\{\theta_1^*, \theta_2^*, \ldots, \theta_B^*\}$ in ascending order
4. Set $\theta_L = \text{quantile}(\theta^*, \alpha/2)$
5. Set $\theta_U = \text{quantile}(\theta^*, 1-\alpha/2)$
6. return $[\theta_L, \theta_U]$
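Algorithm 1 can be sketched in Python using only the standard library. The function name `bootstrap_ci`, its defaults (`n_boot=2000`, `seed=0`), and the simple order-statistic quantile rule are illustrative assumptions rather than part of the algorithm; `alpha` is treated as the significance level, so the interval targets 1 - alpha coverage.

```python
import random
import statistics


def bootstrap_ci(data, statistic=statistics.mean, alpha=0.05, n_boot=2000, seed=0):
    """Percentile bootstrap interval for `statistic` with nominal coverage 1 - alpha."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample n observations with replacement n_boot times, then sort the replicates
    replicates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    # Empirical alpha/2 and 1 - alpha/2 quantiles, taken as order statistics
    # (one of several quantile conventions; chosen here for simplicity)
    lower = replicates[int((alpha / 2) * (n_boot - 1))]
    upper = replicates[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


rng = random.Random(42)
sample = [rng.gauss(5.0, 2.0) for _ in range(100)]
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: [{low:.3f}, {high:.3f}]")
```

Passing a different callable as `statistic`, for example `statistics.median`, reuses the same resampling loop for another estimator.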

Example 2: Simple Pseudocode Block

ALGORITHM: Linear Regression via Gradient Descent
INPUT: X (n × p design matrix), y (n × 1 response vector), learning rate α, tolerance ε
OUTPUT: β (p × 1 coefficient vector)

1. Initialize β = 0, iteration = 0, maxiter = 1000
2. REPEAT until ‖∇L‖ < ε OR iteration ≥ maxiter:
   a. Compute predictions: ŷ = Xβ
   b. Compute residuals: r = y - ŷ
   c. Compute gradient of L = ‖y - Xβ‖²/n: ∇L = -2X'r/n
   d. Update parameters: β = β - α∇L
   e. iteration = iteration + 1
3. RETURN β
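The gradient descent steps can be turned into a small runnable sketch. This version uses plain Python lists so it has no dependencies; a NumPy version would replace the inner loops with matrix products. The function name `gradient_descent_ols`, the default learning rate, and the stopping rule (largest absolute gradient component below `tol`) are illustrative assumptions.

```python
import random


def gradient_descent_ols(X, y, lr=0.1, max_iter=1000, tol=1e-8):
    """Least-squares coefficients by batch gradient descent on the mean squared error."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        # Residuals r = y - X beta
        resid = [yi - sum(xij * bj for xij, bj in zip(row, beta))
                 for row, yi in zip(X, y)]
        # Gradient of the mean squared error: -2 X'r / n
        grad = [-2.0 * sum(row[j] * r for row, r in zip(X, resid)) / n
                for j in range(p)]
        # Assumed stopping rule: stop once every gradient component is tiny
        if max(abs(g) for g in grad) < tol:
            break
        # Step against the gradient
        beta = [bj - lr * gj for bj, gj in zip(beta, grad)]
    return beta


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
X = [[1.0, xi] for xi in x]  # intercept column plus one predictor
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
print(gradient_descent_ols(X, y))
```

With a standardized predictor a learning rate of 0.1 converges quickly; badly scaled columns generally need a smaller rate or rescaling first.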

Example 3: Detailed Algorithm with Comments

Algorithm 2 Maximum Likelihood Estimation (Newton-Raphson)

Input: Data $X$, log-likelihood function $\ell(\theta|X)$, initial value $\theta_0$, tolerance $\epsilon$

Output: $\hat{\theta}_{MLE}$

1. Set $\theta \leftarrow \theta_0$, $t \leftarrow 0$
2. while not converged do
   - Compute score function: $s(\theta) = \nabla_\theta \ell(\theta|X)$
   - Compute Hessian: $H(\theta) = \nabla^2_\theta \ell(\theta|X)$
   - Update: $\theta \leftarrow \theta - H(\theta)^{-1}s(\theta)$
   - Check convergence: if $\|s(\theta)\| < \epsilon$ at the updated $\theta$ then break
   - $t \leftarrow t + 1$
   - if $t > \text{max\_iterations}$ then break
3. end while
4. Set $\hat{\theta}_{MLE} \leftarrow \theta$
5. Compute standard errors: $\text{SE}(\hat{\theta}) = \sqrt{\text{diag}(-H(\hat{\theta})^{-1})}$
6. return $\hat{\theta}_{MLE}$
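A minimal scalar instance of this Newton-type scheme is the rate of an exponential distribution, where the score is n/λ - Σx and the Hessian is -n/λ². The model choice, the function name `exponential_mle`, and the defaults are assumptions made for illustration. The closed-form MLE, 1/mean(x), gives an independent check. For this model the starting value must be positive and below twice the MLE, otherwise the first update leaves the parameter space.

```python
import math
import random


def exponential_mle(x, rate0, tol=1e-10, max_iter=100):
    """Newton-Raphson MLE of an exponential rate; returns (estimate, standard error)."""
    n, total = len(x), sum(x)
    rate = rate0
    for _ in range(max_iter):
        score = n / rate - total       # first derivative of the log-likelihood
        hessian = -n / rate ** 2       # second derivative of the log-likelihood
        rate = rate - score / hessian  # Newton update
        if abs(n / rate - total) < tol:  # score evaluated at the updated value
            break
    # Standard error from the inverse of the observed information, -H
    se = math.sqrt(-1.0 / (-n / rate ** 2))
    return rate, se


rng = random.Random(7)
x = [rng.expovariate(2.0) for _ in range(500)]
rate_hat, se_hat = exponential_mle(x, rate0=0.5)
print(f"rate = {rate_hat:.4f}, SE = {se_hat:.4f}, closed form = {len(x) / sum(x):.4f}")
```

In the vector case the division by the Hessian becomes a linear solve, and the standard errors come from the diagonal of the inverse observed information, as in the final steps of the algorithm.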

Comparing Approaches

Here’s how the same analysis looks across languages:

Task: Calculate Mean and 95% CI

R:

data <- rnorm(100, mean = 5, sd = 2)
mean_val <- mean(data)
ci <- t.test(data)$conf.int

Julia:

using Statistics, HypothesisTests
data = randn(100) .* 2 .+ 5
mean_val = mean(data)
ci = confint(OneSampleTTest(data))

Stata:

clear
set obs 100
generate data = rnormal(5, 2)
* mean reports the estimate with its standard error and 95% CI
mean data

Python:

import numpy as np
from scipy import stats
data = np.random.normal(5, 2, 100)
mean_val = np.mean(data)
ci = stats.t.interval(0.95, len(data)-1,
                      loc=mean_val,
                      scale=stats.sem(data))

Best Practices

1. Use syntax highlighting for display-only code: most readers just need to see the code.
2. Execute code sparingly: only execute when output is essential.
3. Document your setup: note required packages and versions.
4. Test before publishing: ensure code runs without errors.
5. Provide context: explain what the code does and why.
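For the "document your setup" item, one option is to generate the version list from the environment rather than typing it by hand. The helper below is a sketch; the name `describe_environment` and the package names passed to it are placeholders, not a recommendation of specific packages.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return 'name version' lines for the interpreter and the given packages."""
    lines = [f"Python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing, so the note is still produced
            lines.append(f"{name} (not installed)")
    return lines


print("\n".join(describe_environment(["numpy", "scipy"])))
```

The printed lines can be pasted under a code block so readers know which versions the example was written against.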

Resources

- [Quarto Stata Documentation](https://quarto.org/docs/computations/stata.html)
- [Quarto Julia Documentation](https://quarto.org/docs/computations/julia.html)
- [Highlight.js Language Support](https://highlightjs.org/static/demo/)

Conclusion

With support for multiple languages, you can now share statistical methods and algorithms in the language that best suits your analysis and audience.