Python to write multiple dataframes and highlight rows inside an excel file

I am trying to write multiple dataframes into an Excel file, one after another, using the same logic. Nothing changes from one dataframe to the next except the number of columns or the number of records. The functions applied are the same.

For example,

writer = pd.ExcelWriter(OutputName)
Emp_ID_df.to_excel(writer,'Sheet1',index = False)
Visa_df.to_excel(writer,'Sheet2',index = False)
custom_df_1.to_excel(writer,'Sheet3',index = False)
writer.save()

Once the file is written, I want to highlight a row if any of its Boolean columns contains False. I am not aware of any library in Anaconda that can highlight cells in Excel, so I am going with the native route.

from win32com.client import Dispatch #to work with excel files
Pre_Out_df_ncol = Emp_ID_df.shape[1]
Pre_Out_df_nrow = Emp_ID_df.shape[0]
RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)
arr = (Emp_ID_df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows = np.arange(1, len(Emp_ID_df)+ 1)[arr].tolist()

Pre_Out_df_ncol_2 = Visa_df.shape[1]
Pre_Out_df_nrow_2 = Visa_df.shape[0]
RequiredCol_let_2 = colnum_num_string(Pre_Out_df_ncol_2)
arr_2 = (Visa_df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows_2 = np.arange(1, len(Visa_df)+ 1)[arr_2].tolist()

Pre_Out_df_ncol_3 = custom_df_1.shape[1]
Pre_Out_df_nrow_3 = custom_df_1.shape[0]
RequiredCol_let_3 = colnum_num_string(Pre_Out_df_ncol_3)
arr_3 = (custom_df_1.select_dtypes(include=[bool])).eq(False).any(axis=1).values
ReqRows_3 = np.arange(1, len(custom_df_1)+ 1)[arr_3].tolist()

xlApp = Dispatch("Excel.Application")
xlwb1 = xlApp.Workbooks.Open(OutputName)
xlApp.visible = False
print ("\n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

for i in range(len(ReqRows)):
   j = ReqRows[i] + 1
   xlwb1.sheets('Sheet1').Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet1').Columns.AutoFit()

for i in range(len(ReqRows_2)):
   j = ReqRows_2[i] + 1
   xlwb1.sheets('Sheet2').Range('A' + str(j) + ":" + RequiredCol_let_2 + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet2').Columns.AutoFit()

for i in range(len(ReqRows_3)):
   j = ReqRows_3[i] + 1
   xlwb1.sheets('Sheet3').Range('A' + str(j) + ":" + RequiredCol_let_3 + str(j)).Interior.ColorIndex = 6
xlwb1.sheets('Sheet3').Columns.AutoFit()

Finally, I rename the sheets:

xlwb1.Sheets("Sheet1").Name = "XXXXA"
xlwb1.Sheets("Sheet2").Name = "XXXXASDAD"
xlwb1.Sheets("Sheet3").Name = "SADAD"
xlwb1.Save()

There are a few problems here:

1) The number of dataframes keeps increasing, which means I am writing the same code again and again.

2) The highlighting works, but it is too slow. Sometimes 90% of the rows need to be highlighted. There are 1 million rows, and doing them one after another takes 35 minutes.

Kindly help me with this.

edited 2 hours ago

Mathias Ettinger

24.7k33184

asked 2 hours ago

Sid29

1184

New contributor

Did you have a look at xlsxwriter.readthedocs.io/example_pandas_conditional.html and xlsxwriter.readthedocs.io/working_with_conditional_formats.html?
– Graipher
2 hours ago

python python-3.x pandas


2 Answers

First, notice that your code repeats itself three times. This goes against the principle Don't Repeat Yourself (DRY).

The only real differences between processing your three sheets are their names and the underlying dataframes, so you could split this into two functions:

from win32com.client import Dispatch
import numpy as np
import pandas as pd


def highlight_false(df):
    """Return the 1-based numbers of rows where any boolean column is False."""
    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values
    return np.arange(1, len(df) + 1)[arr].tolist()


def color_rows(sheet, rows, col):
    """Color columns A..col of each given row, then auto-fit the columns."""
    for row in rows:
        # +1 skips the header row written by to_excel.
        cells = f"A{row+1}:{col}{row+1}"
        sheet.Range(cells).Interior.ColorIndex = 6
    sheet.Columns.AutoFit()


if __name__ == "__main__":
    Emp_ID_df = ...
    writer = pd.ExcelWriter(OutputName)
    Emp_ID_df.to_excel(writer, sheet_name='Sheet1', index=False)
    writer.close()  # write the file to disk before Excel opens it

    excel_app = Dispatch("Excel.Application")
    workbook = excel_app.Workbooks.Open(OutputName)
    excel_app.visible = False

    sheet_names = ["Sheet1"]
    dfs = [Emp_ID_df]
    for sheet_name, df in zip(sheet_names, dfs):
        sheet = workbook.Sheets(sheet_name)
        rows = highlight_false(df)
        col = colnum_num_string(df.shape[1])  # helper from the question
        color_rows(sheet, rows, col)
    workbook.Save()
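On the speed problem: every `sheet.Range(...)` assignment in `color_rows` is a separate call into Excel, so a million flagged rows means a million calls. One possible mitigation, sketched here under the assumption that flagged rows often sit next to each other (plausible when 90 % of the rows are flagged), is to merge consecutive row numbers into runs and color each run with a single assignment. The names `group_consecutive` and `color_runs` are hypothetical, and this has not been timed against the original loop.

```python
def group_consecutive(rows):
    """Collapse sorted row numbers into inclusive (first, last) runs."""
    runs = []
    for row in rows:
        if runs and row == runs[-1][1] + 1:
            # Row directly follows the current run: extend it.
            runs[-1] = (runs[-1][0], row)
        else:
            # First row, or a gap: start a new run.
            runs.append((row, row))
    return runs


def color_runs(sheet, rows, last_col, color_index=6):
    """Color each run of rows with one Range assignment.

    `sheet` is assumed to be a win32com worksheet and `rows` the 1-based
    row numbers returned by highlight_false; +1 skips the header row.
    """
    for first, last in group_consecutive(rows):
        cells = f"A{first + 1}:{last_col}{last + 1}"
        sheet.Range(cells).Interior.ColorIndex = color_index


print(group_consecutive([1, 2, 3, 5, 7, 8]))  # [(1, 3), (5, 5), (7, 8)]
```

`color_runs` takes the same arguments as `color_rows`, so it could be swapped in at the call site. `sheet.Columns.AutoFit()` would then have to be called separately.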

However, there is an even easier method using xlsxwriter:

import pandas as pd

Emp_ID_df = ...

writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')
Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)

workbook = writer.book
format1 = workbook.add_format({'bg_color': '#FFC7CE',
                               'font_color': '#9C0006'})
dfs = [Emp_ID_df]
for df, sheet in zip(dfs, writer.sheets.values()):
    nrow, ncol = df.shape
    # colnum_num_string is the helper from the question, which passes it
    # the column count to get the letter of the last column.
    col_letter = colnum_num_string(ncol)
    cells = f"A1:{col_letter}{nrow+1}"
    sheet.conditional_format(cells, {"type": "cell",
                                     "criteria": "==",
                                     "value": 0,
                                     "format": format1})
writer.close()  # writes the file; older pandas versions also accept writer.save()

You might have to ensure that the sheets do not get out of sync with the dataframes, or just keep track of the name under which you save each dataframe.

In addition, I followed Python's official style guide, PEP8, which recommends lower_case names for functions and variables, and I renamed your variables so they are a lot clearer.
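One caveat about the xlsxwriter version: a rule of type "cell" formats each matching cell on its own, whereas the question asks for the whole row to be colored when any Boolean column is False. A formula-type rule can express that. The sketch below only builds the options dictionary. It assumes the Boolean columns arrive in Excel as TRUE/FALSE values and that the data starts on row 2, below the header. `false_row_rule` is a hypothetical helper, and the column letters passed to it are example values.

```python
def false_row_rule(bool_col_letters, first_data_row=2):
    """Build conditional-format options that match a row when any
    of the listed columns holds FALSE.

    The "$" pins the column while the row number stays relative, so Excel
    re-evaluates the formula for every row of the range it is applied to.
    """
    if not bool_col_letters:
        raise ValueError("need at least one boolean column")
    tests = ",".join(f"${col}{first_data_row}=FALSE" for col in bool_col_letters)
    return {"type": "formula", "criteria": f"=OR({tests})"}


rule = false_row_rule(["B", "D"])
print(rule["criteria"])  # =OR($B2=FALSE,$D2=FALSE)

# Hypothetical use with the objects from the loop above (sheet, nrow, ncol,
# format1); the numeric range covers the data rows only, not the header:
#     sheet.conditional_format(1, 0, nrow, ncol - 1, {**rule, "format": format1})
```

Because the rule is stored in the workbook and evaluated by Excel when the file is opened, Python does not need to loop over the rows.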

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587


1) The number of dataframes keeps increasing, which means I am writing the same code again and again

You're repeating a lot of code that essentially does the same thing, with only a couple of variables changing.
I think you should wrap it in a function.
I've taken your code and put it inside one:

from datetime import datetime

import numpy as np
from win32com.client import Dispatch


def highlight_false_cells(sheetName, dataFrame, OutputName):
    Pre_Out_df_ncol = dataFrame.shape[1]
    Pre_Out_df_nrow = dataFrame.shape[0] # Is this required? It doesn't look to be used.
    RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)
    arr = (dataFrame.select_dtypes(include=[bool])).eq(False).any(axis=1).values
    ReqRows = np.arange(1, len(dataFrame) + 1)[arr].tolist()

    xlApp = Dispatch("Excel.Application")
    xlwb1 = xlApp.Workbooks.Open(OutputName)
    xlApp.visible = False
    print("\n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))

    for i in range(len(ReqRows)):
        j = ReqRows[i] + 1
        xlwb1.sheets(sheetName).Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6
    xlwb1.sheets(sheetName).Columns.AutoFit()

    xlwb1.Save()

To call this for your dataframes:

highlight_false_cells("XXXXA", Emp_ID_df, OutputName)
highlight_false_cells("XXXXASDAD", Visa_df, OutputName)
highlight_false_cells("SADAD", custom_df_1, OutputName)

I'm not really familiar with the packages you're using, so there may be a mistake in the logic. However, hopefully it gives you a good example of how to take your work and put it into a function.

I'd also recommend looking into a programming principle called "DRY", which stands for "Don't Repeat Yourself". If you can learn to spot areas like this, where you have a lot of repeated lines, I believe it will make your code easier to maintain.
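The function above, like the code in the question, relies on `colnum_num_string`, which the question calls but never shows. A hypothetical stand-in, assuming it turns a 1-based column number into Excel's column letters, might look like this (`colnum_to_letters` is an invented name, not the asker's implementation):

```python
def colnum_to_letters(number):
    """Convert a 1-based column number to Excel letters (1 -> A, 27 -> AA)."""
    if number < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while number > 0:
        # Excel columns are a base-26 system without a zero digit,
        # hence the shift by one before each division.
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


for n in (1, 26, 27, 702, 703):
    print(n, colnum_to_letters(n))
```

Writing the helper once and calling it from a single function keeps the column arithmetic in one place, which fits the DRY principle mentioned above.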

answered 1 hour ago

cphilip

262

New contributor

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

Sid29 is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f214042%2fpython-to-write-multiple-dataframes-and-highlight-rows-inside-an-excel-file%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).

The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:

from win32com.client import Dispatch

import pandas as pd



def highlight_false(df):

    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values

    return np.arange(1, len(df) + 1)[arr].tolist()





def color_rows(sheet, rows, col):

    for row in rows:

        cells = f"A{row+1}:{col}{row+1}"

        sheet.Range(cells).Interior.ColorIndex = 6

    sheet.Columns.AutoFit()





if __name__ == "__main__":

    Emp_ID_df = ...

    writer = pd.ExcelWriter(OutputName)

    Emp_ID_df.to_excel(writer, 'Sheet1', index=False)



    excel_app = Dispatch("Excel.Application")

    workbook = excel_app.Workbooks.Open(OutputName)

    excel_app.visible = False



    sheet_names = ["Sheet1"]

    dfs = [Emp_ID_df]

    for sheet_name, df in zip(sheet_names, dfs):

        sheet = workbook.Sheets(sheet)

        rows = highlight_false(df)

        col = colnum_num_string(df.shape[1])

        color_rows(sheet, rows, col)

However, there is an even easier method using xlsxwriter:

import pandas as pd



Emp_ID_df = ...



writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')

Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)



workbook = writer.book

format1 = workbook.add_format({'bg_color': '#FFC7CE',

                               'font_color': '#9C0006'})

dfs = [Emp_ID_df]

for df, sheet in zip(dfs, writer.sheets.values()):

    nrow, ncol = df.shape

    col_letter = colnum_num_string(ncol + 1)

    cells = f"A1:{col_letter}{nrow+1}"

    sheet.conditional_format(cells, {"type": "cell",

                                     "criteria": "==",

                                     "value": 0,

                                     "format": format1})

writer.save()

You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.

In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587

add a comment |

First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).

The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:

from win32com.client import Dispatch

import pandas as pd



def highlight_false(df):

    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values

    return np.arange(1, len(df) + 1)[arr].tolist()





def color_rows(sheet, rows, col):

    for row in rows:

        cells = f"A{row+1}:{col}{row+1}"

        sheet.Range(cells).Interior.ColorIndex = 6

    sheet.Columns.AutoFit()





if __name__ == "__main__":

    Emp_ID_df = ...

    writer = pd.ExcelWriter(OutputName)

    Emp_ID_df.to_excel(writer, 'Sheet1', index=False)



    excel_app = Dispatch("Excel.Application")

    workbook = excel_app.Workbooks.Open(OutputName)

    excel_app.visible = False



    sheet_names = ["Sheet1"]

    dfs = [Emp_ID_df]

    for sheet_name, df in zip(sheet_names, dfs):

        sheet = workbook.Sheets(sheet)

        rows = highlight_false(df)

        col = colnum_num_string(df.shape[1])

        color_rows(sheet, rows, col)

However, there is an even easier method using xlsxwriter:

import pandas as pd



Emp_ID_df = ...



writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')

Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)



workbook = writer.book

format1 = workbook.add_format({'bg_color': '#FFC7CE',

                               'font_color': '#9C0006'})

dfs = [Emp_ID_df]

for df, sheet in zip(dfs, writer.sheets.values()):

    nrow, ncol = df.shape

    col_letter = colnum_num_string(ncol + 1)

    cells = f"A1:{col_letter}{nrow+1}"

    sheet.conditional_format(cells, {"type": "cell",

                                     "criteria": "==",

                                     "value": 0,

                                     "format": format1})

writer.save()

You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.

In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587

add a comment |

First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).

The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:

from win32com.client import Dispatch

import pandas as pd



def highlight_false(df):

    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values

    return np.arange(1, len(df) + 1)[arr].tolist()





def color_rows(sheet, rows, col):

    for row in rows:

        cells = f"A{row+1}:{col}{row+1}"

        sheet.Range(cells).Interior.ColorIndex = 6

    sheet.Columns.AutoFit()





if __name__ == "__main__":

    Emp_ID_df = ...

    writer = pd.ExcelWriter(OutputName)

    Emp_ID_df.to_excel(writer, 'Sheet1', index=False)



    excel_app = Dispatch("Excel.Application")

    workbook = excel_app.Workbooks.Open(OutputName)

    excel_app.visible = False



    sheet_names = ["Sheet1"]

    dfs = [Emp_ID_df]

    for sheet_name, df in zip(sheet_names, dfs):

        sheet = workbook.Sheets(sheet)

        rows = highlight_false(df)

        col = colnum_num_string(df.shape[1])

        color_rows(sheet, rows, col)

However, there is an even easier method using xlsxwriter:

import pandas as pd



Emp_ID_df = ...



writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')

Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)



workbook = writer.book

format1 = workbook.add_format({'bg_color': '#FFC7CE',

                               'font_color': '#9C0006'})

dfs = [Emp_ID_df]

for df, sheet in zip(dfs, writer.sheets.values()):

    nrow, ncol = df.shape

    col_letter = colnum_num_string(ncol + 1)

    cells = f"A1:{col_letter}{nrow+1}"

    sheet.conditional_format(cells, {"type": "cell",

                                     "criteria": "==",

                                     "value": 0,

                                     "format": format1})

writer.save()

You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.

In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587

First, starting from your code, you should realize that you are repeating yourself, three times. This goes against the principle Don't repeat Yourself (DRY).

The only real difference between processing your three sheets are their name and the underlying dataframe, so you could make this into two functions:

from win32com.client import Dispatch

import pandas as pd



def highlight_false(df):

    arr = (df.select_dtypes(include=[bool])).eq(False).any(axis=1).values

    return np.arange(1, len(df) + 1)[arr].tolist()





def color_rows(sheet, rows, col):

    for row in rows:

        cells = f"A{row+1}:{col}{row+1}"

        sheet.Range(cells).Interior.ColorIndex = 6

    sheet.Columns.AutoFit()





if __name__ == "__main__":

    Emp_ID_df = ...

    writer = pd.ExcelWriter(OutputName)

    Emp_ID_df.to_excel(writer, 'Sheet1', index=False)



    excel_app = Dispatch("Excel.Application")

    workbook = excel_app.Workbooks.Open(OutputName)

    excel_app.visible = False



    sheet_names = ["Sheet1"]

    dfs = [Emp_ID_df]

    for sheet_name, df in zip(sheet_names, dfs):

        sheet = workbook.Sheets(sheet)

        rows = highlight_false(df)

        col = colnum_num_string(df.shape[1])

        color_rows(sheet, rows, col)

However, there is an even easier method using xlsxwriter:

import pandas as pd



Emp_ID_df = ...



writer = pd.ExcelWriter(OutputName, engine='xlsxwriter')

Emp_ID_df.to_excel(writer, sheet_name="Sheet1", index=False)



workbook = writer.book

format1 = workbook.add_format({'bg_color': '#FFC7CE',

                               'font_color': '#9C0006'})

dfs = [Emp_ID_df]

for df, sheet in zip(dfs, writer.sheets.values()):

    nrow, ncol = df.shape

    col_letter = colnum_num_string(ncol + 1)

    cells = f"A1:{col_letter}{nrow+1}"

    sheet.conditional_format(cells, {"type": "cell",

                                     "criteria": "==",

                                     "value": 0,

                                     "format": format1})

writer.save()

You might have to ensure that the sheets do not get out of snyc from the data frames, or just keep track of what name you save each dataframe to.

In addition I used Python's official style-guide, PEP8, which recommends using lower_case for functions and variables as well as renaming your variables so they are a lot clearer.

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587

edited 1 hour ago

answered 1 hour ago

Graipher

24.9k53587

answered 1 hour ago

Graipher

24.9k53587

answered 1 hour ago

Graipher

24.9k53587

add a comment |

1) My number of dataframe increases and which means I am writing up
the same code again and again

def highlight_false_cells(sheetName, dataFrame, OutputName):

    Pre_Out_df_ncol = dataFrame.shape[1]

    Pre_Out_df_nrow = dataFrame.shape[0] # Is this required? It doesn't look to be used.

    RequiredCol_let = colnum_num_string(Pre_Out_df_ncol)

    arr = (dataFrame.select_dtypes(include=[bool])).eq(False).any(axis=1).values

    ReqRows = np.arange(1, len(dataFrame) + 1)[arr].tolist()



    xlApp = Dispatch("Excel.Application")

    xlwb1 = xlApp.Workbooks.Open(OutputName)

    xlApp.visible = False

    print("n...Highlighting the Output File at " + datetime.now().strftime('%Y-%m-%d %H:%M:%S'))



    for i in range(len(ReqRows)):

        j = ReqRows[i] + 1

        xlwb1.sheets(sheetName).Range('A' + str(j) + ":" + RequiredCol_let + str(j)).Interior.ColorIndex = 6

    xlwb1.sheets(sheetName).Columns.AutoFit()



    xlwb1.Save()

To call this for your dataframes:

highlight_false_cells("XXXXA", Emp_ID_df, OutputName)
highlight_false_cells("XXXXASDAD", Visa_df, OutputName)
highlight_false_cells("SADAD", custom_df_1, OutputName)
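Because the function takes the sheet name and dataframe as arguments, the repeated calls could also be driven from a single mapping. The sketch below shows the idea without Excel or pandas, so it can be run anywhere. `rows_to_highlight` is a hypothetical pure-Python counterpart of the row selection inside `highlight_false_cells`, and the sample records are invented.

```python
def rows_to_highlight(records):
    """Return Excel row numbers (header in row 1) whose record has a False flag.

    Only values that are the boolean False are matched, roughly mirroring
    select_dtypes(include=[bool]).eq(False) in the function above.
    """
    return [
        position + 2  # +1 for 1-based rows, +1 for the header row
        for position, record in enumerate(records)
        if any(value is False for value in record.values())
    ]


# Invented sample data: sheet name -> list of row records
sheets = {
    "XXXXA": [
        {"emp_id": 1, "valid": True},
        {"emp_id": 2, "valid": False},
    ],
    "SADAD": [
        {"visa": "A", "active": False, "checked": True},
        {"visa": "B", "active": True, "checked": True},
        {"visa": "C", "active": True, "checked": False},
    ],
}

# One loop over the mapping replaces one call per dataframe
to_highlight = {name: rows_to_highlight(records) for name, records in sheets.items()}
print(to_highlight)
```

With the real function, the same loop might read `for sheet_name, df in frames.items(): highlight_false_cells(sheet_name, df, OutputName)`, where `frames` is a hypothetical dict of sheet name to dataframe. Keeping the row selection separate from the Excel automation also makes it possible to test it without opening a workbook.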

answered 1 hour ago

cphilip

262

New contributor

