The practice of double entry bookkeeping originated in renaissance Italy as a simple method for merchants to keep track of their daily transactions. The merchant who invented this concept sought to track the inflows of money into his business, while also tracking the outflows of jewelry that he sold. To do so, he devised this concept of credits and debits, and the basis of modern double-entry bookkeeping. Double entry bookkeeping soon spread from Italy to the rest of the world, as Italy was a great center of trade, renowned the world over for their innovations in commerce, science, industry, and the arts. While most people associate the renaissance with art, architecture, and sculpture- the foundations of modern civics and finance also took root during this time in history. From that humble beginning as as practice of a few Venice merchants, Accounting has become a globally important practice and spawned numerous entire professions; professions in which millions of finance professionals ply their trade every day.
While the concepts of double entry bookkeeping took hold - new issues arose - and banks and financiers started to form more formal bodies of rules to dictate exactly how these practices should take place. Specifics of how to record transactions, when to record them, and how to “balance the books”, became essential to the global practice of accounting. That leads us to modern times and the rules and formulas we will discuss in this book. Modern accounting practices in the US follow a rule set called the Generally Accepted Accounting Principles (GAAP) - a set of rules and advice from the Financial Accounting Standards Board (FASB). Although GAAP offers some black and white guidance, it also has gray areas, limits and loopholes that may be exploited.
This guide translates the key elements and concepts from the Accountant’s toolbox into reusable R code, and and presents textbook-style guide. In this guide, each chapter contains a different general subject matter, and contains a variety of formulas that can be called as a function. These functions may be called in a series, or nested into other formulas, to generate useful calculations and visualizations.
This guide is intended for finance professionals, statisticians, data scientists, math teachers, students*, or anyone else who needs to generate, visualize, apply, or solve accounting problems.
This R markdown file will use several packages as defined below. This code will load the packages only of they are not already installed. These packages will be loaded into the local scope via the library functions.
if(!require(pracma)) install.packages("pracma", repos = "http://cran.us.r-project.org")
if(!require(dplyr)) install.packages("dplyr", repos = "http://cran.us.r-project.org")
if(!require(ggplot2)) install.packages("ggplot2", repos = "http://cran.us.r-project.org")
if(!require(ggforce)) install.packages("ggforce", repos = "http://cran.us.r-project.org")
if(!require(gtools)) install.packages("gtools", repos = "http://cran.us.r-project.org")
if(!require(matlib)) install.packages("matlib", repos = "http://cran.us.r-project.org")
if(!require(MASS)) install.packages("MASS", repos = "http://cran.us.r-project.org")
if(!require(kableExtra)) install.packages("kableExtra", repos = "http://cran.us.r-project.org")
if(!require(RcppAlgos)) install.packages("RcppAlgos", repos = "http://cran.us.r-project.org")
if(!require(latex2exp)) install.packages("latex2exp", repos = "http://cran.us.r-project.org")
if(!require(openxlsx)) install.packages("openxlsx", repos = "http://cran.us.r-project.org")
library(pracma)
library(dplyr)
library(ggplot2)
library(ggforce)
library(gtools)
library(matlib)
library(MASS)
library(kableExtra)
library(RcppAlgos)
library(latex2exp)
library(openxlsx)
options(scipen=999)
Since accounting deals primarily in the transfer of money, it is important for any system of tracking to start with some data collection rules and standards. When we make a sale we are collecting income - when we pay a bill we are incurring expenses.
Data needs to be collected about each transaction, so we can keep track of what the income or expense was for. R provides a powerful platform for accounting, as it combines data management, visualization, and advanced mathematics into one platform.
So, to start out - we will consider what kinds of data you might need to collect for a bookkeeping system. Some of the obvious details that come to mind for a transaction are things like the date and what was sold. Some less obvious things you want to track in an accounting environment - are things like the account code, transaction type (debit or credit) and other codes like a purchase order, invoice number, or other categorical / filing details.
A transaction is an exchange of goods or value. Some types of transactions may result in nothing actually changing hands - rather they may simply be transfers between different parts of your account ledger. But the most obvious example of a transaction is when you sell something. In this case, you are moving money out of your assets (inventory) and into your income (sales).
The idea of debits and credits can be confusing at first. For some accounts, a debit is positive and a credit is negative, for other accounts a debit is negative and a credit is positive. This depends on which side of the balance sheet the account belongs to.
Assets are the amounts & things you Own.
Liabilities are the amounts & things you Owe.
Income is the inflow of money into your business.
Expenses are the things you pay out of your business.
Equity is your ownership or net value of your business.
Revenue is your profits on income, minus expenses on the things you sold.
To record a transaction in R, you would probably want to start with an R data frame object. A data frame or data table, is the foundation of a more complex structure of relational tables that might track transactions and help you balance your ledger. SOme of the details we might want to record for a transaction include the date and amount of the transaction, what type of transaction it is, and who we transacted with. Lets see how you might do that using R code:
The idea behind double entry bookkeeping is to have a list of accounts, that taken together should always balance out to 0. This means for every dollar you take in, you place a dollar into one account, and take a dollar out of another account.
accounts_df <- data.frame()
accounts_df
A general ledger table should collect all the key details about each transaction.
ledger_df <- data.frame()
ledger_df
A cash flow statement should record movements into and out of cash accounts
cashflow_df <- data.frame()
cashflow_df
An income statement should record profit and expense related to goods sold
income_df <- data.frame()
income_df
Now we will create a few functions to start adding lines to our account and ledger tables. We will need accounts to be set up before enteriong any transactions - so we will start there.
add_account <- function(
category,
sub_category,
acct_name,
acct_code,
acct_open_date,
acct_balance,
acct_direction,
acct_permanent){
new_row <- data.frame(
category=category,
sub_category=sub_category,
acct_name=acct_name,
acct_code=acct_code,
acct_open_date=acct_open_date,
acct_balance=acct_balance,
acct_direction=acct_direction,
acct_permanent=acct_permanent
)
accounts_df <<- accounts_df %>% bind_rows(new_row)
}
OK, we now have a mechanism (function) to add data to our accounts table. Lets start by adding some key accounts with 0 balances. This will create our initial “chart of accounts”
#set the open_date for all of these accounts
open_date <- as.Date("2020-1-1")
#add_account(category,sub_category,acct_name,acct_code,acct_open_date,acct_balance)
add_account("Assets","Cash","Cash on Hand",1111,open_date,0,1,1)
add_account("Assets","Cash","Checking Account",1112,open_date,0,1,1)
add_account("Assets","Cash","Money Market",1113,open_date,0,1,1)
add_account("Assets","Accounts receivable","Credit Receipts",1211,open_date,0,1,1)
add_account("Assets","Inventory","Raw Materials",1411,open_date,0,1,1)
add_account("Assets","Inventory","Raw Packaging",1412,open_date,0,1,1)
add_account("Assets","Inventory","Finished Products",1420,open_date,0,1,1)
add_account("Assets","Land & Buildings","Land",1611,open_date,0,1,1)
add_account("Assets","Land & Buildings","Buildings",1612,open_date,0,1,1)
add_account("Liabilities","Accounts payable","Accounts payable",2111,open_date,0,0,1)
add_account("Liabilities","Debt","Long-Term Debt",2112,open_date,0,0,1)
add_account("Equity","Owners Equity","Capital",3111,open_date,0,0,1)
add_account("Equity","Owners Equity","Retained Earnings",3112,open_date,0,0,1)
add_account("Revenues","Sales","Sales Revenue",4011,open_date,0,0,0)
add_account("Expenses","Operating Expenses","Wages",4111,open_date,0,1,0)
add_account("Expenses","Operating Expenses","Taxes",4112,open_date,0,1,0)
add_account("Expenses","Occupancy","Rent",4113,open_date,0,1,0)
add_account("Expenses","Advertising","Advertising",4114,open_date,0,1,0)
add_account("Expenses","Communications","Phone",4115,open_date,0,1,0)
add_account("Expenses","Inventory Expense","Raw Materials",5111,open_date,0,1,0)
add_account("Expenses","Inventory Expense","Packaging",5112,open_date,0,1,0)
add_account("Expenses","Inventory Expense","Finished Products",5113,open_date,0,1,0)
Lets take a look at our chart of accounts now.
accounts_df
Now we will create a few functions to start adding lines to our account and ledger tables. We will need these accounts to be set up before entering any transactions - so we’ll start there.
#create a function to add transactions
add_transaction <- function(
trans_amount=NULL,
acct_lookup=NULL,
trans_date=NULL,
budget_code=NULL,
po_number=NULL,
invoice_number=NULL,
payment_date=NULL,
payment_method=NULL){
validated = TRUE
#make sure at least the account code and $ amount are set
if(is.null(trans_amount)){
msg_confirm <<- paste('Non-0 Transaction amount is required.')
validated = FALSE
}
if(is.null(acct_lookup)){
msg_confirm <<- paste('Acct code required.')
validated = FALSE
}
if(!validated){
success <- FALSE
return(success)
}
#set date to now if not provided
if(is.null(trans_date)){
trans_date = Sys.time()
}
#set budget_code to blank if not provided
if(is.null(trans_date)){
budget_code = ""
}
#set po_number to blank if not provided
if(is.null(po_number)){
po_number = ""
}
#set invoice_number to blank if not provided
if(is.null(invoice_number)){
invoice_number = ""
}
#set payment_date to blank if not provided
if(is.null(payment_date)){
payment_date = ""
}
#set payment_method to blank if not provided
if(is.null(payment_method)){
payment_method = ""
}
#get the name of this account to stamp on the ledger
acct_name <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_name)
#get the direction of this account to determine the transaction type
acct_direction <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_direction)
if(acct_direction ==1){
#a positive number is a debit
if(trans_amount>0){
trans_type = "Debit"
}else{
trans_type = "Credit"
}
}else{
#a positive number is a credit
if(trans_amount>0){
trans_type = "Credit"
}else{
trans_type = "Debit"
}
}
abs_trans_amount <- abs(trans_amount)
#create a row for the transaction
new_row <- data.frame(
trans_date=trans_date,
trans_amount=abs_trans_amount,
trans_type=trans_type,
acct_name=acct_name,
acct_code=acct_lookup,
budget_code=budget_code,
po_number=po_number,
invoice_number=invoice_number,
payment_date=payment_date,
payment_method=payment_method
)
#get the old balance
curr_balance <- accounts_df %>% filter(acct_code==acct_lookup) %>% pull(acct_balance)
#add this transaction amount
new_balance <- curr_balance + trans_amount
#set a confirmation message in the environment
msg_confirm <<- paste('Balance was updated due to a ',trans_type,' - from ',curr_balance,' to ',new_balance,' on account:',acct_name,sep='')
#update the account with the new balance
accounts_df <<- accounts_df %>% mutate(acct_balance = ifelse(acct_code == acct_lookup,new_balance,acct_balance))
#return the ledger
ledger_df <<- ledger_df %>% bind_rows(new_row)
success <- TRUE
return(success)
}
Now we will add the 10K initial cash investment by adding 10K to cash (on the left side of the balance sheet) and 10K to Equity on the right side.
#set up one transaction of $10000 to cash
#the add_transaction function will figure out if its a debit or credit, based on if I pas a positive or negative number.
trans_amount <- 10000
acct_code <- 1111
trans_date <- Sys.time()
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_A <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
If the operation succeeded, we should get a message that it did.
if(success_A){
#the transaction recorded correctly
msg_confirm
}else{
#the transaction recorded incorrectly
msg_confirm
}
## [1] "Balance was updated due to a Debit - from 0 to 10000 on account:Cash on Hand"
Now we’ll examine the accounts and ledger.
ledger_df
accounts_df
Looks good - as expected, the Cash on Hand balance is now $10,000.
So we now have exactly one transaction in our ledger. But double entry bookkeeping means that you should really have 2 entries (at least) for each transaction. This is how you will maintain your balance sheet - by adding a debit in one account, and an equal credit in another (usually).
So lets go ahead and add another transaction in the same amount - this time adding 10K in owner equity - this is what the business “owes” it’s owner in terms of stock or capital:
if(success_A){
trans_amount <- 10000
acct_code <- 3111
trans_date <- as.Date("2020-1-1")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_B <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
}
If the operation succeeded, we should get a message that it did.
if(success_B){
msg_confirm
}else{
msg_confirm
}
## [1] "Balance was updated due to a Credit - from 0 to 10000 on account:Capital"
simple_plot <- ggplot() +
geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) +
ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
simple_plot
Now we need to take some money out of our cash on hand, and buys some materials we want to sell Since the owner has funded the initial $1000 investment - we just need to make some transfers now from our cash account, into our other accounts we are tracking. We will take $500 in cash and go to our distributor, where we buy $350 in raw materials, and $150 in packaging.
#set up one transaction of $10000 to cash
#the add_transaction function will figure out if its a debit or credit, based on if I pas a positive or negative number.
trans_amount <- -500
acct_code <- 1111
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_A <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
If the operation succeeded, we should get a message that it did.
if(success_A){
msg_confirm
}else{
msg_confirm
}
## [1] "Balance was updated due to a Credit - from 10000 to 9500 on account:Cash on Hand"
Now we’ll examine the accounts and ledger.
ledger_df
accounts_df
simple_plot <- ggplot() +
geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) +
ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
simple_plot
Now that we have spent some of our cash on hand, we need to recall what we spent it on. In this case, were purchasing some raw materials ($350) and packaging ($150) for our business.
if(success_A){
trans_amount <- 350
acct_code <- 1411
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_B <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
trans_amount <- 150
acct_code <- 1412
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_C <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
}
If the operation succeeded, we should get a message that it did.
if(success_B){
msg_confirm
}else{
msg_confirm
}
## [1] "Balance was updated due to a Debit - from 0 to 150 on account:Raw Packaging"
Then we take another $300, go down to the furniture store to buy some tables, a market umbrella, and some other fixtures we will use to sell this stuff.
Now we’ll examine the accounts and ledger.
ledger_df
accounts_df
simple_plot <- ggplot() +
geom_col(data=accounts_df, aes(x=acct_name,y=acct_balance,fill=category)) +
ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)') + theme(axis.text.x = element_text(angle = 90))
simple_plot
#use half teh raw materials...
trans_amount <- -175
acct_code <- 1411
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_A <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
#and half teh raw packaging...
trans_amount <- -75
acct_code <- 1412
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_B <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
#to make $250 (at cost) worth of finished products
trans_amount <- 250
acct_code <- 1420
trans_date <- as.Date("2020-1-3")
budget_code <- ""
po_number <- ""
invoice_number <- ""
payment_date <- ""
payment_method <- ""
success_C <- add_transaction(
trans_amount,
acct_code,
trans_date,
budget_code,
po_number,
invoice_number,
payment_date,
payment_method)
So now we have two transactions in the ledger that balance each other. Lets take it a step further and define some common types of transactions and which accounts they affect., These rules will become built-in, multi-step ledger entries to ensure that when entering new transactions, we do so in as few steps as possible, and always using a consistent process for each series of transactions.
GAAP accounting as some formal elements that can help us work out what those rules should be. The income statement is a great way to figure out what portions of each sale are attributable to what parts of the balance sheet. In this example, the cash is entered for the cash on hand account, inventory removed from the inventory area, and the net profit is entered into equity.
Once we have acquired raw goods, we’ll need to make them into finished products for sale, and sell them!
To keep track of this series of steps, we need to make some entries in the ledger. The entries that will be needed include debits or credits to the following accounts:
Cash on Hand (1111) Sales Revenue (4011) Raw Packaging Raw Materials Retained Earnings
This is a complex series of steps. Equations are needed to take some inputs and produce a correct set of entries in the ledger. This calls for a reusable function. The function will take as inputs:
Sale Amount
Cost of Goods Sold
Operating Expense
Less Taxes
And from these inputs it will calculate some other key factors including:
Gross profit (Sales - COGS)
Operating Profit (Gross profit - Operating Expense)
Net Income (Operating profit - Less Taxes)
Since this is a guide about accounting, and not manufacturing per-se - we are not going to worry too much about the details of the manufacturing process just yet. For now, we will just estimate some of the production costs that go into our products; later in this guide, we will look at how to more accurately track the production costs through - you guessed it - double-entry bookkeeping.
add_cash_income <- function(
sales,
cogs,
operating_expense,
less_taxes){
gross_profit <- sales - cogs
operating_profit <- gross_profit-operating_expense
net_income <- operating_profit - less_taxes
#create a new data row with all the input & calculated values
new_row <- data.frame(
trans_date=Sys.time(),
sales=sales,
cogs=cogs,
gross_profit=gross_profit,
operating_expense=operating_expense,
operating_profit=operating_profit,
less_taxes=less_taxes,
net_income=net_income
)
#add the row to income df
income_df <<- income_df %>% bind_rows(new_row)
#get the old balance
curr_balance <- accounts_df %>% filter(acct_code==1111) %>% pull(acct_balance)
#create a new row for the cashflow table
new_row <- data.frame(
trans_date=Sys.time(),
begin_cash=curr_balance,
collected_cash=sales,
paid_cash=0,
end_cash=curr_balance+sales
)
cashflow_df <<- cashflow_df %>% bind_rows(new_row)
#get the time
time_stamp <- Sys.time()
#add the sales as a positive
add_transaction(
sales,
1111,
time_stamp,
"",
"",
"",
"",
"")
#add the sales as a positive (credit) to revenue
add_transaction(
sales,
4011,
time_stamp,
"",
"",
"",
"",
"")
#add the cost of good sold as a negative to finished inventory
add_transaction(
-cogs,
1420,
time_stamp,
"",
"",
"",
"",
"")
#add the cost of good sold as a positive to inventory expenses for raw materials
add_transaction(
cogs,
5111,
time_stamp,
"",
"",
"",
"",
"")
#add the net income to owner equity retained earnings
add_transaction(
net_income,
3112,
time_stamp,
"",
"",
"",
"",
"")
}
We have the function for making cash sales - now it’s time to use it.
In this scenario will go to a local dealers showcase and make some sales of our products. We will consider operating expense as 0 and tax as 0 since these are wholesale sales, and the maker does not yet take a salary - so we will only count the cost of the goods sold considering only raw materials.
Luckily, our dealers showcase was successful - we sold all $250 worth of product (at cost) that we made.
sale_amount <- 60
cogs <- 30
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 40
cogs <- 20
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 120
cogs <- 75
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 110
cogs <- 65
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
sale_amount <- 90
cogs <- 60
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
cashflow_df
income_df
ledger_df
accounts_df
The cash flow statement tracks cash transactions only.
The balance sheet is a critical tool for managing the inflows and outflows of money between the various accounts. This is where the double-entry concept really comes in - for each transaction on “one side” of the balance sheet - there should be a matching transaction on th e:other side” - somewhere. Knowing where the transaction should appear is really what GAAP rules help define.
So in the above example - I have traded $10,000 in cash for $10,000 in capital. My Equity has increased on the one side, and my assets have increased on the other.
There are a couple key formulas we need to introduce now:
So in this case, the balance sheet should be balanced - we should have 10,000 in assets and an equal amount in Equity.
So lets check out our balance sheet now by summing up each of the accounts we have defined above.
balance_sheet <- function(){
accts_total <<- accounts_df %>% group_by(category,acct_direction) %>% summarize(curr_balance=sum(acct_balance))
assets_balance <<- accts_total %>% filter(category == "Assets") %>% pull(curr_balance)
liabilities_balance <<- accts_total %>% filter(category == "Liabilities") %>% pull(curr_balance)
equity_balance <<- accts_total %>% filter(category == "Equity") %>% pull(curr_balance)
revenue_balance <<- accts_total %>% filter(category == "Revenues") %>% pull(curr_balance)
expenses_balance <<- accts_total %>% filter(category == "Expenses") %>% pull(curr_balance)
are_permanent_accounts_balanced <- assets_balance==liabilities_balance+equity_balance
are_permanent_accounts_balanced
}
balance_sheet()
## [1] TRUE
accts_total
simple_plot <- ggplot() +
geom_col(data=accts_total, aes(x=category,y=curr_balance,fill=category)) +
ggtitle('Balance Sheet') + labs(x='Account Category',y='Balance ($)')
simple_plot
Here we can check if the permanent* accounts are balanced:
(Assets, Liabilities, Equity and Permanent accounts. Revenue and expenses are “temporary” accounts which reset each year.)
are_permanent_accounts_balanced <- assets_balance == liabilities_balance+equity_balance
are_permanent_accounts_balanced
## [1] TRUE
So far, so good - we have balanced our accounts.
cashflow_df
profit_balance <- revenue_balance-expenses_balance
profit_balance
## [1] 170
And we now have made profit_balance
in profit.
cashflow_df$id <- as.numeric(row.names(cashflow_df))
simple_plot <- cashflow_df %>% ggplot(aes(x=id,y=end_cash)) +
geom_point() +
geom_line(aes(x=id,y=end_cash)) +
ggtitle('Cash Flow Summary') + labs(x='Transaction',y='Cash')
simple_plot
We have explored how to make entries into a general ledger, income statement, cash flow statement, and balance sheet.
Now, where do you go from here? Perhaps you would like to have the cost of goods sold (COGS) be calculated for you? Let see how that might work by exploring the inventory account more thoroughly.
To make this all more tangible, lets image we have a soap company. We make soaps, lotions, fizzy bath bombs, and other personal care products. Now lets list out some recipes for our most popular products. This will require setting up some relational data frames that will cross reference each other. The tables should follow relational rules and strive to approximate third normal form.
This table keeps the unit of measure of each ingredient source. Some are measured in oz. dry weight, some in fl. oz. liquid volume, and others are measured by the drop. Keeping the units of measure will help in case we ever need to convert between like measures, say, oz vs lb. or liters vs floz.
unit_df <- data.frame()
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=1,unit='oz',unit_name='ounce',unit_type='Dry Weight',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=2,unit='floz',unit_name='fluid ounce',unit_name='Liquid Volume',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=3,unit='drop',unit_name='droplet',unit_name='Liquid Volume',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=4,unit='lb',unit_name='pound',unit_name='Dry Weight',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=5,unit='ea',unit_name='each',unit_name='Each',system='standard'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=6,unit='g',unit_name='gram',unit_name='Dry Weight',system='metric'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=7,unit='l',unit_name='liter',unit_name='Liquid Volume',system='metric'))
unit_df <- unit_df %>% bind_rows(data.frame(unit_id=8,unit='gal',unit_name='gallon',unit_name='Liquid Volume',system='metric'))
This table stores some recipes we will analyze for cost & profit.
recipe_df <- data.frame()
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=1,prod_code='BB-LAV',recipe_name='bath bomb - lavendar',batch_size=20,price_each=4,price_batch=80))
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=2,prod_code='BB-ROSE',recipe_name='bath bomb - rose',batch_size=16,price_each=4.50,price_batch=72))
recipe_df <- recipe_df %>% bind_rows(data.frame(recipe_id=3,prod_code='BB-ORNG',recipe_name='bath bomb - citrus',batch_size=24,price_each=3.75,price_batch=90))
This table keeps track of our raw ingredients
ingred_df <- data.frame()
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=1,ing_code='EP-SALT',ingredient_name='epsom salt',unit_id=1))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=2,ing_code='B-SODA',ingredient_name='baking soda',unit_id=1))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=3,ing_code='COCO-OIL',ingredient_name='coconut oil',unit_id=2))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=4,ing_code='ESS-OIL-LAV',ingredient_name='essential oil - lavender',unit_id=3))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=5,ing_code='ESS-OIL-ROSE',ingredient_name='essential oil - rose',unit_id=3))
ingred_df <- ingred_df %>% bind_rows(data.frame(ing_id=6,ing_code='ESS-OIL-ORNG',ingredient_name='essential oil - citrus',unit_id=3))
This table keeps track of what raw ingredients are needed for each recipe, in what amounts.
recipeparts_df <- data.frame()
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=1,ing_id=4,recipe_amount=40))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=2,ing_id=5,recipe_amount=50))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=1,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=2,recipe_amount=16))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=3,recipe_amount=6))
recipeparts_df <- recipeparts_df %>% bind_rows(data.frame(recipe_id=3,ing_id=6,recipe_amount=30))
We can look up any of our recipes by ID using this:
recipe_df %>%
left_join(recipeparts_df,by='recipe_id') %>%
left_join(ingred_df,by='ing_id') %>%
left_join(unit_df,by='unit_id') %>%
filter(recipe_id==1) %>%
dplyr::select(prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name, price_each, price_batch, batch_size)
OK, so here is a recipe we can use to calculate cost of goods sold (COGS). To do so, we need to know what the cost of all the individual ingredients are in this recipe.
This table will store the purchase price and unit size of each source package, a portion of which will be used by each recipe.
ingred_cost_df <- data.frame()
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=1,cost_amount=12,unit_size=48))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=2,cost_amount=9,unit_size=32))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=3,cost_amount=5,unit_size=24))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=4,cost_amount=4,unit_size=160))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=5,cost_amount=5,unit_size=160))
ingred_cost_df <- ingred_cost_df %>% bind_rows(data.frame(ing_id=6,cost_amount=6,unit_size=160))
OK, lets see how our recipe stacks up now that we have our material costs factored in. Drumroll please…
recipe_lookup <- 1
recipe_cost <- recipe_df %>%
left_join(recipeparts_df,by='recipe_id') %>%
left_join(ingred_df,by='ing_id') %>%
left_join(unit_df,by='unit_id') %>%
left_join(ingred_cost_df,by='ing_id') %>%
filter(recipe_id==recipe_lookup) %>%
mutate(cost_ing = cost_amount * recipe_amount / unit_size, cost_ing_per = (cost_amount * (recipe_amount / unit_size))/batch_size,) %>%
dplyr::select(prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name,cost_ing,batch_size,cost_ing_per)
recipe_cogs <- recipe_cost %>% summarize(batch_cost=sum(cost_ing)) %>% pull(batch_cost)
recipe_value <- recipe_df %>% filter(recipe_id==recipe_lookup) %>% pull(price_batch)
gross_profit <- recipe_value-recipe_cogs
profit_margin <- round((gross_profit / recipe_cogs) * 100,2)
paste('The cost of good sold on this batch is',recipe_cogs,' for a gross profit of',gross_profit,' and a profit margin on this recipe of:',profit_margin,'% ')
## [1] "The cost of good sold on this batch is 10.75 for a gross profit of 69.25 and a profit margin on this recipe of: 644.19 % "
Not bad, our recipe yields a healthy profit margin of profit_margin
. So as we sell product we can now closely track the COGS for use with our rule-based account entry function.
recipe_list_df <- recipe_df %>%
left_join(recipeparts_df,by='recipe_id') %>%
left_join(ingred_df,by='ing_id') %>%
left_join(unit_df,by='unit_id') %>%
left_join(ingred_cost_df,by='ing_id') %>%
mutate(cost_ing = cost_amount * recipe_amount / unit_size, cost_ing_per = (cost_amount * (recipe_amount / unit_size))/batch_size,) %>%
dplyr::select(recipe_id,prod_code,recipe_name,recipe_amount,unit,ing_code,ingredient_name,cost_ing,batch_size,cost_ing_per,price_batch,price_each,batch_size)
recipe_list_df
recipe_summary_df <- recipe_list_df %>% group_by(recipe_id,recipe_name,price_batch,price_each,batch_size) %>% summarize(recipe_cost=sum(cost_ing)) %>% ungroup() %>% mutate(each_cost=recipe_cost/batch_size)
recipe_summary_df
By using precise measurements of our materials, we can determine the correct price for each item we sell, based on our desired profit margin. IN this example, we can see that slight differences in the amount of materials used, and the cost of those materials, may impact several factors like the yield (quantity produced) of each batch, the price of each item and the price of the batch. This can also be helpful for planning for the amount of materials we need to order to produce a certain amount of end products.
It is a good idea to store some basic information about your customers, if available. Personally identifiable information is subject to legal protection. Be sure to secure and protect this information if you collect and store it!
cust_number <- 1001
customer_df <- data.frame()
customer_df <- customer_df %>% bind_rows(data.frame(
cust_id=cust_number,
fname='Joe',
lname='Smith',
email='[email protected]',
phone='17072914948',
optin=1
))
Set some meta data for a new invoice. Depending on whether its a sale or purchase, it may have some fields provided or omitted.
invoice_number <- 100001
invoice_header_df <- data.frame(
budget_code='A1',
po_number='',
cust_id=cust_number,
invoice_number=invoice_number,
payment_date=Sys.time(),
payment_method='Cash')
An invoice should list each individual item purchased and the quantity of that item, its individual cost, and the extension cost of the entire line item.
invoice_df <- data.frame()
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=1,unit_quantity=6))
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=2,unit_quantity=8))
invoice_df <- invoice_df %>% bind_rows(data.frame(invoice_number=invoice_number,recipe_id=3,unit_quantity=10))
invoice_df
You can figure out the costs for each line item using some filters and summation functions.
#figure out the cost by taking the per item cost and multiplying it by quantity
order_cost <- invoice_df %>%
left_join(recipe_summary_df,by='recipe_id') %>%
mutate(line_price = unit_quantity * price_each, line_cost = unit_quantity * each_cost) %>%
dplyr::select(unit_quantity,recipe_name,price_each,line_price, line_cost, each_cost, recipe_cost)
order_cost
#get the order total
order_total <- order_cost %>% summarize(order_total=sum(line_price)) %>% pull(order_total)
#get the cost total
order_cogs <- order_cost %>% summarize(order_cost=sum(line_cost)) %>% pull(order_cost)
#determine gross profit
order_profit <- order_total - order_cogs
#round to 2 decimal places
order_cogs <- round(order_cogs,2)
order_profit <- round(order_profit,2)
#Calculate profit margin
profit_margin <- round((order_profit / order_cogs) * 100,2)
#write to screen
paste('The cost of goods sold on this order is $',order_cogs,' for a gross profit of $',order_profit,' and a profit margin on this recipe of:',profit_margin,'% ',sep="")
## [1] "The cost of goods sold on this order is $13.41 for a gross profit of $84.09 and a profit margin on this recipe of:627.07% "
OK… this invoice has a few lines for order_total
dollars and we can see exactly what amount of this is cost: order_cogs
#order lines to customer
order_invoice <- order_cost %>% dplyr::select(unit_quantity,recipe_name,price_each,line_price)
order_invoice
get_next_id <- function(){
next_id <- invoice_header_df %>% arrange(-invoice_number) %>% top_n(1) %>% pull(invoice_number)
next_id <- next_id + 1
next_id
}
get_next_id()
## [1] 100002
Now, recall our add_cash_sale function which need both an amount in cash collected, and a cost of goods sold. Now we can more easily find this for each invoice we create.
#pull the sale amount and cost off the invoice generator
sale_amount <- order_total
cogs <- order_cogs
operating_expense <- 0
less_taxes <- 0
add_cash_income(sale_amount,cogs,operating_expense,less_taxes)
## [1] TRUE
balance_sheet()
## [1] TRUE
accts_total
simple_plot <- ggplot() +
geom_col(data=accts_total, aes(x=category,y=curr_balance,fill=category)) +
ggtitle('Account Balances') + labs(x='Account Category',y='Balance ($)')
simple_plot
It’s important to store some basic information about your workers.
worker_id_seed <- 1001
worker_df <- data.frame()
get_next_worker_id <- function(){
next_worker_id <- worker_df %>% arrange(-worker_id) %>% slice(1) %>% pull(worker_id)
next_worker_id <- next_worker_id + 1
next_worker_id
}
worker_number <- worker_id_seed
worker_df <- worker_df %>% bind_rows(data.frame(
worker_id=worker_number,
fname='Laura',
lname='Jones',
email='[email protected]',
phone='15056221212',
worker_hourly_wage=18,
worker_hourly_tax=3.60
))
worker_number <- get_next_worker_id()
worker_df <- worker_df %>% bind_rows(data.frame(
worker_id=worker_number,
fname='Jared',
lname='Wilson',
email='[email protected]',
phone='15056221213',
worker_hourly_wage=16,
worker_hourly_tax=3.20
))
worker_number <- get_next_worker_id()
worker_df <- worker_df %>% bind_rows(data.frame(
worker_id=worker_number,
fname='Lawrence',
lname='Davis',
email='[email protected]',
phone='15056221214',
worker_hourly_wage=16.80,
worker_hourly_tax=3.40
))
worker_df
simple_plot <- ggplot() +
geom_col(data=worker_df, aes(x=paste(fname,lname),y=worker_hourly_wage,fill=paste(fname,lname))) +
ggtitle('Worker Wages') + labs(x='Worker',y='Wage per Hour')
simple_plot
Now that we have determined the cost for the raw materials - what about the labor involved in making the product - this should also be considered part of the cost. Since different employees get paid different amounts - w can either take the average cost of each item, or try to track the batch by batch costs using first in first out (FIFO) or last in first out (LIFO).
#set the worker's hourly wage & taxes you have to pay
worker_hourly_wage <- 16
worker_hourly_tax <- 3.20
#set the time to make each batch
batch_time_hr <- .5
#set the time to make each batch
batch_labor_cost <- batch_time_hr * (worker_hourly_wage + worker_hourly_tax)
batch_labor_cost
## [1] 9.6
per_item_labor <- batch_labor_cost / 20
per_item_labor
## [1] 0.48
Each time we create a batch of products, we will enter this in a productivity tracker table. This will help us recall: A. What were the ingredient costs of that specific batch B. What were the labor costs of that specific batch
batch_id_seed <- 1001
batch_df <- data.frame()
get_next_batch_id <- function(){
next_batch_id <- batch_df %>% arrange(-batch_id) %>% slice(1) %>% pull(batch_id)
next_batch_id <- next_batch_id + 1
next_batch_id
}
add_batch <- function(
batch_id=NULL,
batch_date=Sys.time(),
recipe_id=NULL,
worker_id=NULL,
batch_time_hr=NULL){
batch_df <<- batch_df %>% bind_rows(data.frame(
batch_id=batch_id,
batch_date=batch_date,
recipe_id=recipe_id,
worker_id=worker_id,
batch_time_hr=batch_time_hr
))
}
batch_number <- batch_id_seed
batch_df <- batch_df %>% bind_rows()
batch_number <- batch_id_seed
add_batch(
batch_id=batch_number,
batch_date=Sys.time(),
recipe_id=1,
worker_id=1001,
batch_time_hr=.65
)
batch_number <- get_next_batch_id()
add_batch(
batch_id=batch_number,
batch_date=Sys.time(),
recipe_id=2,
worker_id=1002,
batch_time_hr=.75
)
batch_number <- get_next_batch_id()
add_batch(
batch_id=batch_number,
batch_date=Sys.time(),
recipe_id=3,
worker_id=1003,
batch_time_hr=.5
)
batch_df
Having created a few batches, we can start to analyze the cost of each batch in terms of both materials and labor.
batch_cost_df <- batch_df %>%
left_join(recipe_summary_df, by="recipe_id") %>%
left_join(worker_df, by="worker_id") %>%
mutate(batch_wage_cost = batch_time_hr * worker_hourly_wage,
batch_tax_cost = batch_time_hr * worker_hourly_tax,
each_wage_cost = (batch_time_hr * worker_hourly_wage)/batch_size,
each_tax_cost = (batch_time_hr * worker_hourly_tax)/batch_size
)
Now we have added some labor costs into the cost of goods sold.
simple_plot <- ggplot() +
geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"batch price"),y=price_batch,label=recipe_id,fill='Price')) +
geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"batch qty"),y=batch_size,label=recipe_id,fill='Qty')) +
ggtitle('Batch Costs by Recipe') + labs(x='Batch Price ($) and Units Per Batch (#)',y='Total')
simple_plot
simple_plot <- ggplot() +
geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"each(matl)"),y=each_cost,label=recipe_id,fill='Materials')) +
geom_col(data=batch_cost_df, aes(x=paste(recipe_id,"each(labor)"),y=each_wage_cost,label=recipe_id,fill='Labor')) +
ggtitle('Unit Costs by Recipe') + labs(x='Material Cost ($) and Labor Cost ($) per Unit',y='Total')
simple_plot
R has convenient functions for working with excel (xlsx) files. These files make a great way to create an application out of pure R without a database of any kind. The excel files become your application database, enabling your data to be permanently stored and accessed outside the R environment.
# TO read data from an Excel file or Workbook object into a data.frame:
#df <- read.xlsx('name-of-your-excel-file.xlsx')
#To write a data.frame or list of data.frames to an xlsx file:
#?write.xlsx
acct_worksheets <- list(
"General Ledger" = ledger_df,
"Chart of Accounts" = accounts_df,
"Balance Sheet" = accts_total,
"Cashflow Statement" = cashflow_df,
"Income Statement" = income_df
)
write.xlsx(acct_worksheets, file = "Accounting.xlsx", colNames = TRUE, borders = "columns")
recipe_worksheets <- list(
"Recipes" = recipe_df,
"Ingredients" = ingred_df,
"Recipe Parts" = recipeparts_df,
"Recipe Costs" = recipe_cost,
"Units of Measure" = unit_df
)
write.xlsx(recipe_worksheets, file = "Recipes.xlsx", colNames = TRUE, borders = "columns")
prod_worksheets <- list(
"Workers" = worker_df,
"Batches" = batch_cost_df
)
write.xlsx(prod_worksheets, file = "Production.xlsx", colNames = TRUE, borders = "columns")
invoice_worksheets <- list(
"Invoices" = invoice_header_df,
"Invoice Items" = invoice_df,
"Customers" = customer_df
)
write.xlsx(invoice_worksheets, file = "Invoices.xlsx", colNames = TRUE, borders = "columns")
Having saved the files that store all this key data, you may now use Excel to see your ledger, balance sheet and other key reports at any time.
Since this document is focused on generating excel files, we have included links to the excel files generated below. These files act as the permanent database for your accounting project.
If you were to try to use R for long term accounting - you would need more interactivity in your accounting system. You need ways to, for example, add customers, create invoices, and process sales without writing a ton of code every time. You need reusable interfaces.
Check back for a future installment of this e-book to see how these basoic accounting concepts could be transferred to a more practical and re-usable system of double-entry bookkeeping.
This concludes “An Accounting Companion in R Markdown” - Thanks for reading, and please check out our web site for more installments:
In “A Finite Math Companion in R Markdown”, we explore more fully the subjects of finite mathematics including sets, financial formulas, interest and annuities, matrix operations, and more.
In “A Statistics Companion in R Markdown”, we will explore more fully the subjects of probability, statistics, and additional applications of expected value, including subjects of linear modeling and inference. We will look at how to use R to generate histograms, ogives, box plots, and other statistical charts.
In “A Linear Algebra Companion in R Markdown” we will more thoroughly explore the subject of vector spaces, and how these to visualize vectors geometrically using the principles of Linear Algebra. This guide will explore more about the inner product, number spaces, and the concepts of magnitude and orthogonality.
R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
To get the latest version of R, please visit:
Working with R is much easier and faster when you use the RStudio IDE. This is the environment used to create this R Markdown file.
Packages used in this document are available from the Comprehensive R Archive Network (CRAN):
For more info on R packages visit:
This document was generated by RMarkdown using R, R Studio, R Markdown, and LaTeX, and rendered to HTM via the Knit command and KnitR utility.
Additional formatting has been applied to results variables using the kable and kableExtra packages. These code blocks are not shown in the output, for readability.