
Status

AppReduce code is not yet released. We plan to release the code along with the design doc below, so please check back soon!

Objective

AppReduce primarily aims to reduce commercial application license spending. It does this by making license costs more transparent to employees and by providing them with self-service, automatic mechanisms for uninstalling applications. In addition to its primary goals, AppReduce can help an organization meet various secondary goals such as encouraging the use of free applications, relying on cloud-based solutions, simplifying the installed IT footprint, providing a web-based interface to view licenses, etc.

Background

Surely at some point everyone has installed a commercial application, used it a few times, then had it sit idle on their computer. Often such applications have associated licensing costs and therefore money has been spent on something that is not being used; said licenses can cost as little as $5 and as much as $1000 or more. With many licensing agreements customers pay a true-up fee at a fixed interval, meaning they pay based on the number of instances across an organization regardless of which specific employees or computers have the application. In these cases, if a user uninstalls a piece of previously licensed software it typically does not equate to a refund from the software manufacturer; however the next time a different user needs that same application and installs it, a new license does not need to be purchased. The license paid for originally becomes available for reuse because the original user has uninstalled the application.

Furthermore, some manufacturers allow their customers to pay a premium that entitles them to free upgrades to the newest software versions. For example, if such an agreement exists with CompanyX and a user upgrades CompanyX ApplicationY version 2009 to ApplicationY version 2010, a new license would not need to be purchased. In some cases this may also mean one employee could uninstall CompanyX ApplicationY 2009 and a different employee could install CompanyX ApplicationY 2010, and the old 2009 license could be used instead of purchasing a new license for 2010.

Employees may legitimately need commercial software, even when it is used infrequently. However, almost all employees can benefit from an easily accessible reminder of their installed licensed software, whether or not they choose to uninstall any of that software.

Non-Goals

  • AppReduce does not attempt to strip employees of software that is valuable to their productivity.
  • AppReduce does not account for individual license purchases (for example, licenses purchased for or by a single user that are not part of a larger pool of licenses).
  • AppReduce does not aim to modify software request, approval, or installation workflow/policies.
  • AppReduce is not intended to contribute to an atmosphere of unmeasured, unjustified cost cutting. It would be ill advised to nickel and dime small pools of employees or suggest any changes that don't result in a net licensing savings.

Overview

Upon visiting the AppReduce site, an employee sees a list of licensed software that is installed on their computer(s), along with the associated licensing costs of these applications. The employee can automatically uninstall application(s) with the click of a button. Statistics on cost per user, per group, and for the organization overall are displayed to give users a relative perspective on their license costs.

Infrastructure

AppReduce uses the following infrastructure and services:

  • Corporate Inventory Systems
    • Microsoft® Systems Management Server® or System Center Configuration Manager® (aka SMS or SCCM) SQL Server Database: bulk import of Windows® workstation inventory information. Note: AppReduce works with any inventory system, though it may require some glue code to extract the data from non-SMS/SCCM systems.
    • LDAP, Active Directory, or other employee account DB: employee's department, office, job title, etc.
  • Google App Engine (aka GAE): frontend web hosting and backend datastore.
  • Google Secure Data Connector (aka SDC): secure tunnel between App Engine and corp-hosted APIs.
  • Generic Open RPC Daemon (aka GORD): accessed via SDC to fetch or manipulate corporate data (e.g. assigning automated uninstall packages, refreshing host information in real time, etc).
  • Optional cron server: jobs for alerts, bulk data import from SMS or SCCM, reporting processing, etc.
  • Google Charts API: graphic reports.

Design Chart

General Website Flow

  • Employee navigates to the AppReduce application, hosted on App Engine.
  • AppReduce pulls employee information (department, manager, etc.) from the Employee model.
  • AppReduce loads a list of computers that belong to the user from the Computer model and displays this list to the user along with model information, applications and their licensing costs, etc.
  • If AppReduce finds no computer associated with the employee's username, the employee can easily file a helpdesk ticket to have their computer(s) investigated for eligibility.
    • The ticket template provides a field for manual computer name entry, with graphical instructions on how to get the computer name.
    • The ticket suggests that the helpdesk team member repair the inventory client (e.g. SCCM on Windows® hosts) and notify the user when the machine is present in these systems and thus eligible for AppReduce.
  • If AppReduce finds one or more computers associated with the employee's username, it displays each computer with known licensed applications and costs.
    • Individual application costs are stored in a YAML file, with regular expressions mapping a given application name to a known license.
  • The employee can click "Uninstall" to have SCCM automatically uninstall an application in the background.
    • Clicking "Uninstall" initiates a call via SDC, accessing the appropriate corp API to assign the uninstaller to the machine (e.g. GORD for Windows® hosts running SCCM).
      • All SCCM packages/uninstallers are to be mandatory/forced/silent and therefore require no user interaction, whenever possible.
    • If no automated uninstaller exists for a particular application, a manual uninstall request is sent to the employee via a helpdesk ticket or email.
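
The regular-expression mapping from application names to known licenses, mentioned in the flow above, can be sketched roughly as follows. This is a minimal illustration rather than AppReduce's actual code: the pattern list stands in for the parsed YAML file, and the function names, application names, and costs are all invented for the example.

```python
import re

# Hypothetical stand-in for the parsed YAML cost file: each entry maps a
# regular expression for installed application names to a license and cost.
LICENSE_PATTERNS = [
    (re.compile(r'^CompanyX ApplicationY( \d{4})?$', re.IGNORECASE),
     'CompanyX ApplicationY', 250.0),
    (re.compile(r'^ExampleCorp Diagram Tool\b', re.IGNORECASE),
     'ExampleCorp Diagram Tool', 99.0),
]


def match_license(installed_name):
    """Returns (license_name, cost) for the first matching pattern, else None."""
    for pattern, license_name, cost in LICENSE_PATTERNS:
        if pattern.search(installed_name):
            return license_name, cost
    return None


def licensed_applications(installed_names):
    """Filters installed application names down to known licensed ones."""
    matches = (match_license(name) for name in installed_names)
    return [match for match in matches if match is not None]
```

Because the first matching pattern wins, more specific patterns would need to be listed before more general ones.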

Data Models

  • Computer
    • All computers uploaded from Corporate Inventory Data, with further employee data aggregated in, and cost preprocessing completed.
    • Properties:
      • applications_pickle = db.BlobProperty()
        • a pickled list of Application objects (container class)
      • cost = db.FloatProperty()
      • host_mtime = db.DateTimeProperty()
        • time the data was last sent to the authoritative data source from the host
      • manufacturer = db.StringProperty()
      • model = db.StringProperty()
      • mtime = db.DateTimeProperty()
        • time the data was last imported (not time updated in authoritative source)
      • name = db.StringProperty()
      • office = db.StringProperty()
      • owner = db.StringProperty()
      • platform = db.StringProperty()
        • basic platform name (e.g. Windows®, Mac®)
      • uuid = db.StringProperty()
        • in "source_uuid" format (e.g. sms_1234, fooinvsystem_1234).
  • Employee
    • Employee data from corp sources (LDAP, AD, etc), with their summed total cost of all their computers' applications.
    • Properties:
      • cost_center = db.StringProperty()
      • department = db.StringProperty()
      • email_date = db.DateTimeProperty()
      • first_visit = db.DateTimeProperty()
      • last_visit = db.DateTimeProperty()
      • manager = db.StringProperty()
      • mtime = db.DateTimeProperty()
        • time the data was last imported (not time updated in authoritative source)
      • office = db.StringProperty()
      • platforms = db.StringListProperty()
      • total_cost = db.FloatProperty()
      • uid = db.StringProperty()
  • UninstallLogEntry
    • Log of uninstalls/removals via AppReduce.
    • Properties:
      • application_name = db.StringProperty()
      • application_cost = db.FloatProperty()
      • hostname = db.StringProperty()
      • owner_uid = db.StringProperty()
      • os_platform = db.StringProperty()
      • department = db.StringProperty()
      • cost_center = db.StringProperty()
      • reinstalled = db.BooleanProperty(default=False)
        • user reinstalled the application they previously uninstalled
      • ticket_id = db.StringProperty()
        • helpdesk ticket ID
      • time_checked = db.DateTimeProperty()
        • last checked status in inventory sys
      • time_completed = db.DateTimeProperty()
        • confirmed was actually uninstalled
      • time_requested = db.DateTimeProperty(auto_now_add=True)
        • user requested
      • uuid = db.StringProperty()
  • ApplicationCost
    • Average and total application license costs grouped by department, cost center, and overall.
    • Properties:
      • average_cost = db.FloatProperty()
      • average_user_cost = db.FloatProperty()
      • cost = db.FloatProperty()
      • group_type = db.StringProperty()
      • group_name = db.StringProperty()
      • name = db.StringProperty()
      • mtime = db.DateTimeProperty()
  • ApplicationCostHistory
    • Weekly historical snapshot of ApplicationCost.
    • Properties are the same as ApplicationCost, but the entity key is "year_weeknumber_group_type_group_name_name" to keep historical weekly costs. Note: App Engine Data Viewer may present the key name as a SHA-1 hex digest representation. We have not investigated whether it does this for reasons relating to long key name length, or as a result of specific characters being present in the key name.
  • GroupedComputerCost
    • Average and total computer costs grouped by department, cost center, and overall.
    • Properties:
      • average_cost = db.FloatProperty()
      • average_user_cost = db.FloatProperty()
      • cost = db.FloatProperty()
      • name = db.StringProperty()
      • type = db.StringProperty()
      • mtime = db.DateTimeProperty()
  • GroupedComputerCostHistory
    • Weekly historical snapshot of GroupedComputerCost.
    • Properties are the same as GroupedComputerCost, but the entity key is "year_weeknumber_type_name" to keep historical weekly costs. Note: App Engine Data Viewer may present the key name as a SHA-1 hex digest representation. We have not investigated whether it does this for reasons relating to long key name length, or as a result of specific characters being present in the key name.
  • KeyValueCache
    • Model for generic key/value pair storage. This model can be used to store totals for fast retrieval, including overall uninstall dollars, uninstall count, unique employee participation count, etc.
    • Properties:
      • key name = the key
      • value = db.StringProperty()
      • float_value = db.FloatProperty()
      • int_value = db.IntegerProperty()
    • Below is an example of where this model is used. Datastore does not provide efficient row count or sum queries, but the same results can be achieved by updating running totals every time an uninstall occurs.
def AddUninstallToAlltimeStats(computer, application):
  """Adds application uninstall to the stats cache entities.

  Args:
    computer: a Computer object.
    application: an Application object.
  """
  # Add cost to total_savings.
  total_savings = models.KeyValueCache.get_or_insert(
      'total_savings', float_value=0.0)
  total_savings.float_value += application.cost
  total_savings.put()

  # Add 1 to total_uninstalls count.
  total_uninstalls = models.KeyValueCache.get_or_insert(
      'total_uninstalls', int_value=0)
  total_uninstalls.int_value += 1
  total_uninstalls.put()

  # Add 1 to unique users, but only if user hasn't uninstalled before.
  user_is_unique = models.UninstallLogEntry.all().filter(
      'owner_uid =', computer.owner.uid).get() is None
  if user_is_unique:
    total_unique_user_uninstalls = models.KeyValueCache.get_or_insert(
        'total_unique_user_uninstalls', int_value=0)
    total_unique_user_uninstalls.int_value += 1
    total_unique_user_uninstalls.put()
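
The weekly key names described for the history models above ("year_weeknumber_...") could be built along these lines. This is only a sketch: the helper name is hypothetical, and it assumes ISO 8601 week numbering and a zero-padded week number, neither of which the design specifies.

```python
import datetime


def history_key_name(day, *parts):
    """Builds a "year_weeknumber_..." key name for a weekly snapshot.

    Assumes ISO 8601 week numbering. The ISO year is used so that early
    January days belonging to the previous year's last week sort with it.
    """
    iso_year, iso_week, _ = day.isocalendar()
    return '_'.join(['%d' % iso_year, '%02d' % iso_week] + list(parts))


# ApplicationCostHistory: year_weeknumber_group_type_group_name_name
app_key = history_key_name(
    datetime.date(2010, 3, 15), 'department', 'Sales', 'CompanyX ApplicationY')

# GroupedComputerCostHistory: year_weeknumber_type_name
group_key = history_key_name(datetime.date(2010, 3, 15), 'department', 'Sales')
```

Writing a snapshot under such a key more than once in the same week overwrites the earlier entity, which keeps one record per group per week.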

Data Collection / Bulkload

A Python script running on a regular basis (cron) fetches computer and employee information from Corporate Inventory Systems to import into AppReduce. The following models are filled on each bulkload execution: Computer, Employee, ApplicationCost, ApplicationCostHistory, GroupedComputerCost, and GroupedComputerCostHistory.

The following steps are performed:

  1. Data is fetched and aggregated from whatever inventory sources may exist.
    • AppReduce is equipped to handle different data sources for Windows® versus Mac® computers.
  2. Applications on every computer are matched to known licensed applications by string application name using regular expressions.
    • This matching isn't always straightforward as many applications have several display names, different naming conventions based on version, etc.
  3. Individual computer, total owner, and group (department, etc) costs and averages are calculated.
  4. All pre-processed data is pickled using Python's cPickle library and written to a file on disk.
    • Bulkloader natively supports CSV file format, but AppReduce uses pickled data to more easily handle multi-dimensional data.
      • e.g. a Computer has an Owner who has a Department, or a Computer has an Application that has a cost.
  5. An extended bulkloader implementation reads the pickle file and uploads the data to AppReduce Datastore.
    • Bulkloader was extended to read from the pickle file as opposed to CSV, but all other functionality is native.
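
Steps 3 and 4 above can be sketched as follows. This is a simplified illustration under stated assumptions: the dictionary layout and function names are hypothetical, department averages are taken per owner, and the standard pickle module is used (it takes the place of cPickle in Python 3).

```python
import collections
import pickle


def summarize_costs(computers):
    """Computes per-computer, per-owner, and per-department costs.

    Args:
      computers: list of dicts with 'owner', 'department', and 'applications'
        keys, where 'applications' maps application name to license cost.
        Each dict gains a 'cost' key holding the computer's total.
    """
    owner_totals = collections.defaultdict(float)
    owner_department = {}
    for computer in computers:
        computer['cost'] = sum(computer['applications'].values())
        owner_totals[computer['owner']] += computer['cost']
        owner_department[computer['owner']] = computer['department']

    department_totals = collections.defaultdict(float)
    department_owners = collections.defaultdict(set)
    for owner, total in owner_totals.items():
        department = owner_department[owner]
        department_totals[department] += total
        department_owners[department].add(owner)

    departments = {}
    for name, total in department_totals.items():
        departments[name] = {
            'cost': total,
            'average_user_cost': total / len(department_owners[name]),
        }
    return {'owners': dict(owner_totals), 'departments': departments}


def dump_for_bulkload(data):
    """Serializes nested, pre-processed data as a single pickle payload."""
    return pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
```

The nested result shows why a flat CSV row is awkward here: one payload carries owner totals and department aggregates together, and it round-trips through pickle unchanged.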

App Engine TaskQueue Usage

GAE TaskQueues are used to perform background jobs. AppReduce uses TaskQueues for many operations that all follow the same model: do as much as possible within the App Engine request time limit, catch the DeadlineExceededError exception, and defer the remainder of the job to a new Task (a new request). Example pseudo code:

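import datetime
import logging
from google.appengine import runtime
from google.appengine.ext import deferred

def Foo():
  try:
    # ... code which may take longer than the request limit to execute ...
    pass
  except runtime.DeadlineExceededError:
    logging.info('Request limit hit. Deferring Foo to future Task...')
    # A task name cannot be reused right away, so timestamp each deferral.
    now_str = datetime.datetime.utcnow().strftime('%Y-%m-%d-%H-%M-%S')
    deferred.defer(Foo, _name='foo-task-%s' % now_str)
    return
  logging.info('Foo fully completed!')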
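
The same control flow can be exercised without App Engine. The sketch below is a framework-free simulation, not AppReduce code: the DeadlineExceeded class, the item budget (standing in for the request time limit), and the in-memory task queue are all stand-ins invented for illustration.

```python
from collections import deque


class DeadlineExceeded(Exception):
    """Stand-in for App Engine's runtime.DeadlineExceededError."""


def process_batch(pending, processed, task_queue, budget):
    """Handles one simulated request; returns True when the job is finished."""
    try:
        used = 0
        while pending:
            if used >= budget:
                # Simulates the runtime interrupting a long request.
                raise DeadlineExceeded()
            processed.append(pending.popleft())
            used += 1
    except DeadlineExceeded:
        # Defer the remainder of the job to a new task (a new request).
        task_queue.append(
            lambda: process_batch(pending, processed, task_queue, budget))
        return False
    return True


def run_job(items, budget):
    """Runs the job to completion; returns (processed items, request count)."""
    pending, processed, task_queue = deque(items), [], deque()
    requests = 1
    process_batch(pending, processed, task_queue, budget)
    while task_queue:
        requests += 1
        task_queue.popleft()()
    return processed, requests
```

The pattern works because each request makes durable progress before it is interrupted, so the deferred task only has to pick up what remains.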

A more specific example of how this is done in AppReduce can be seen in the data import model cleanup. Because the data is authoritative in Corporate Inventory Systems, not AppReduce, entities that didn't get imported after each import cycle need to be deleted. This determination is based on the assumption that the reason these entities were not imported is that they've likely been deleted or disabled in the authoritative data source. AppReduce handles this deletion as follows: the mtime value is shared across all entities updated in a given data import cycle, so after the cycle it's safe to delete any entities where mtime is less than the newest mtime in the Kind. See the TruncateModelByMtime code below:

import datetime
import logging
import models
from google.appengine import runtime
from google.appengine.ext import db
from google.appengine.ext import deferred

BATCH_DELETE_SIZE = 200

def TruncateModelByMtime(model_name=None):
  """Truncates a given model for all entries older than the latest mtime.

  Note: This only works on models with an "mtime" field.  Furthermore, there is
  no padding or horizon time, so if any mtime is even a second before the
  latest then it gets deleted. When bulkloading models to be used with this
  function, ensure all entities share a common mtime.

  Args:
    model_name: str model name to truncate.
  """
  if model_name is None:
    logging.error('Truncate Model: model is None')
    return
  elif model_name.endswith('History'):
    logging.error('Truncate Model: truncating History models is not allowed.')
    return
  elif not hasattr(models, model_name):
    logging.error('Truncate Model: model does not exist %s', model_name)
    return

  model = getattr(models, model_name)
  if not hasattr(model, 'mtime'):
    logging.error(
        'Truncate Model: model "%s" does not have field "mtime".', model_name)
    return

  logging.info('Truncate Model: Truncating %s', model_name)
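  # Get a single entity with the latest mtime from the model.
  desc_mtime_ent = db.Query(model).order('-mtime').get()
  if desc_mtime_ent is None:
    # The model has no entities, so there is nothing to truncate.
    logging.info('Truncate Model: %s has no entities.', model_name)
    return
  latest_mtime = desc_mtime_ent.mtime
  logging.info('Deleting entities with mtime < %s', latest_mtime)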
  # Get the keys of all entities with mtime < latest mtime.
  keys = db.Query(model, keys_only=True).filter('mtime <', latest_mtime)
  num_deleted = 0
  try:
    to_delete = []
    for key in keys:
      to_delete.append(key)
      if len(to_delete) == BATCH_DELETE_SIZE:
        db.delete(to_delete)
        num_deleted += BATCH_DELETE_SIZE
        to_delete = []
    # Delete remaining keys; if to_delete didn't reach batch size in last loop.
    if to_delete:
      db.delete(to_delete)
      num_deleted += len(to_delete)
  except runtime.DeadlineExceededError:
    # Run this function again in a new TaskQueue entry in case more rows exist.
    logging.info('Deleted %d entries.  Deferring...', num_deleted)
    now_str = datetime.datetime.utcnow().strftime('%Y-%m-%d-%H-%M-%S')
    name = 'truncate-model-%s-%s' % (model_name, now_str)
    deferred.defer(
        TruncateModelByMtime, model_name=model_name, _name=name)
    return
  logging.info('Deleted %d entries.  Complete!', num_deleted)

App Engine Cron Usage

GAE Crons are used (typically with TaskQueue/Deferred) to do the following:

  • Verify success of uninstall requests.
    • Comparing current inventory state to uninstall requests identifies computers that still have applications with pending uninstall requests.
  • Send email alerts or reports.
    • Regular jobs can easily be written to alert admins or email them various reports.
  • Promotional Emails.
    • Regularly (quarterly, annually, etc) send emails to employees with high application costs and/or low usage stats, informing them of their current costs and inviting them to visit AppReduce.

Scalability

Virtually all read operations are cached in our App Engine datastore, and many in memcache as well. GORD is only hit via SDC for write operations, and on-demand data refresh requests. AppReduce can handle a very high load in the context of an internal/corporate application. During the Google-internal launch the application sustained 20 QPS for an extended period of time.

Below is a code snippet that fetches grouped data from the Datastore. It is notable for two reasons: 1) the fetch is fairly expensive, so memcache is used; 2) in Datastore, != (not equal) queries are slower than filtering in a Python loop in certain circumstances, because a != query is actually executed as two separate queries under the hood.

from google.appengine.api import memcache
# common, misc, and models are AppReduce's own modules.

def GetMainReportsData():
  """Gets data needed to load the main reports page.

  Returns:
    Tuple. Sorted lists of departments, cost centers, applications, and quarters
    where uninstalls have occurred.
  """
  main_report_data = memcache.get('main_report_data')
  if main_report_data is not None:
    return main_report_data

  all_departments = models.GroupedComputerCost.all().filter(
      'type = ', 'department').order('name')
  all_cost_centers = models.GroupedComputerCost.all().filter(
      'type = ', 'cost_center').order('-cost').fetch(10)  # Only top 10.
  all_applications = models.ApplicationCost.all().filter(
      'group_type = ', 'total').order('name')

  # A Datastore != query is less efficient than filtering in Python here.
  departments = [department.name for department in all_departments
                 if department.name != 'Unknown']
  # A Datastore != query is less efficient than filtering in Python here.
  cost_centers = [cost_center.name for cost_center in all_cost_centers
                  if cost_center.name != 'Unknown']
  applications = [application.name for application in all_applications]

  # Generate years/quarters for the RoI reports.
  uninstalls = common.GetUninstallsForPastDays()
  quarters = set()
  for u in uninstalls:
    quarter = misc.GetQuarter(u.time_requested.month)
    quarter_string = '%d Q%d' % (u.time_requested.year, quarter)
    quarters.add(quarter_string)
  data = (departments, cost_centers, applications, sorted(list(quarters)))
  memcache.set('main_report_data', data, common.DEFAULT_MEMCACHE_SECS)
  return data
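
The misc.GetQuarter helper used above is not shown in this document. A plausible minimal version, assuming calendar quarters (January to March is Q1) rather than a fiscal calendar, along with the label-building loop written as a standalone function with a hypothetical name:

```python
import datetime


def get_quarter(month):
    """Returns the calendar quarter (1-4) for a month number (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError('month must be in 1..12, got %r' % (month,))
    return (month - 1) // 3 + 1


def quarter_labels(timestamps):
    """Returns sorted, de-duplicated "YYYY Qn" labels for the timestamps."""
    return sorted({'%d Q%d' % (t.year, get_quarter(t.month))
                   for t in timestamps})
```

Because the year comes first in each label, plain string sorting already yields chronological order.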

Work Estimates

Estimated timeline:

  • 3-5 months design and development
  • 2 weeks trial/pilot
  • 2 weeks global rollout and monitoring

During and after rollout the service needs to be monitored for usage, compared to goals, and tweaked with new marketing ideas/plans for better coverage. As a result, an unknown but fairly low amount of work may be needed past launch. As new applications and versions are released, new unattended uninstallers also need to be created and added to AppReduce; however, this should be infrequent.

Caveats

A simple dashboard showing application license spend without the automated uninstall packages is not ideal for a number of reasons:

  • Mac® Finder -> Applications, or Windows® -> Control Panel -> Add/Remove Programs contain unfiltered lists of every application on the machine.
    • By providing employees with a website we show them not only a filtered list of which applications are costing the company money, but also the associated costs.
    • Less searching and clicking: 1) Load webpage; 2) Click Uninstall.
  • Administrative privileges are required to uninstall applications via Windows® Add/Remove Programs, and many employees at many companies do not have that access.

AppReduce highly depends on the data integrity of SCCM. There are a number of scenarios where data may be inaccurate:

  • SCCM looks closely at the user logged into a machine at the time of the hardware inventory scan. If a non-owner, say a helpdesk employee, is signed into the machine at this time, it appears to AppReduce that the machine belongs to the helpdesk employee. These scans are generally taken daily, so this problem corrects itself fairly quickly.
    • AppReduce does not plan to do anything for any such reports, as the auto-correct timeframe is quite small.
  • SCCM has no way to determine whether a machine that hasn't reported back in a given amount of time has been renamed, destroyed, returned to IT, etc. For that reason, it takes ~14 days of no report before these systems consider the host stale. Thus AppReduce may show employees their old computers for a short time, potentially artificially inflating their (and their cost center's, department's, etc.) displayed spend.
    • AppReduce provides employees with a "report a problem with this computer" link, which allows them to file a helpdesk ticket saying the computer has been returned. With this information helpdesk can manually delete the host from SCCM.
  • Every so often clients for SCCM are broken in a state where they do not report back to their appropriate servers. In this case, the machine is essentially non-existent in AppReduce's eyes. This can potentially cause an under-report of an employee's (and their cost center/department/etc's) total spend.
    • AppReduce provides employees with a form to submit for computers that are in their possession that are not showing up on the website. This ticket instructs helpdesk to verify the health of the inventory client; the computer should appear in AppReduce a day or two after it is fixed by helpdesk.
    • When broken clients are fixed, the AppReduce total license costs rise with any licenses on the recently fixed clients. Because this adjustment uncovers costs the company was potentially previously unaware of, it can also be counter-productive to our measurements of AppReduce savings vs actual quarterly spend.
  • Personal software or software installed with a non-corporate key may be included in SCCM's data. However, SCCM may not discriminate between serial numbers/keys, so a company may unfortunately have paid for this software in quarterly true-ups as well.

Microsoft, Windows, System Center Configuration Manager, Systems Management Server, and SQL Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

Mac is either a registered trademark or a trademark of Apple Inc. in the United States and/or other countries.
